DOI: 10.1145/3600006.3613134
research-article
Open access

Private Web Search with Tiptoe

Published: 23 October 2023

Abstract

Tiptoe is a private web search engine that allows clients to search over hundreds of millions of documents, while revealing no information about their search query to the search engine's servers. Tiptoe's privacy guarantee is based on cryptography alone; it does not require hardware enclaves or non-colluding servers. Tiptoe uses semantic embeddings to reduce the problem of private full-text search to private nearest-neighbor search. Then, Tiptoe implements private nearest-neighbor search with a new, high-throughput protocol based on linearly homomorphic encryption. Running on a 45-server cluster, Tiptoe can privately search over 360 million web pages with 145 core-seconds of server compute, 56.9 MiB of client-server communication (74% of which occurs before the client enters its search query), and 2.7 seconds of end-to-end latency. Tiptoe's search works best on conceptual queries ("knee pain") and less well on exact string matches ("123 Main Street, New York"). On the MS MARCO search-quality benchmark, Tiptoe ranks the best-matching result in position 7.7 on average. This is worse than a state-of-the-art, non-private neural search algorithm (average rank: 2.3), but is close to the classical tf-idf algorithm (average rank: 6.7). Finally, Tiptoe is extensible: it also supports private text-to-image search and, with minor modifications, it can search over audio, code, and more.
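
To make the idea of ranking documents under linearly homomorphic encryption concrete, here is a toy sketch in Python. It is not Tiptoe's actual protocol: it assumes a textbook Regev-style secret-key LWE scheme, uses parameters far too small to be secure, and treats embeddings as short integer vectors. All names (`keygen`, `encrypt_query`, `server_scores`, `decrypt_scores`) and constants are hypothetical. The point it illustrates is that the server can compute encrypted inner-product scores between its document embeddings and the client's query embedding without ever seeing the query.

```python
import random

Q = 2**32        # ciphertext modulus (toy size)
P = 2**16        # plaintext modulus; scores must lie in (-P/2, P/2)
DELTA = Q // P   # scaling factor that lifts plaintexts above the noise
N_LWE = 16       # secret dimension; far too small to be secure


def keygen(rng):
    """Client: sample a uniformly random secret vector."""
    return [rng.randrange(Q) for _ in range(N_LWE)]


def encrypt_query(secret, query, rng):
    """Client: encrypt each query coordinate as an LWE pair (a, b)."""
    cts = []
    for q_j in query:
        a = [rng.randrange(Q) for _ in range(N_LWE)]
        e = rng.randint(-2, 2)  # small noise term
        b = (sum(x * s for x, s in zip(a, secret)) + e + DELTA * q_j) % Q
        cts.append((a, b))
    return cts


def server_scores(docs, cts):
    """Server: for each document row, take the same linear combination of
    the ciphertexts that it would take of the plaintext query."""
    out = []
    for row in docs:
        a_sum = [sum(w * a[k] for w, (a, _) in zip(row, cts)) % Q
                 for k in range(N_LWE)]
        b_sum = sum(w * b for w, (_, b) in zip(row, cts)) % Q
        out.append((a_sum, b_sum))
    return out


def decrypt_scores(secret, enc_scores):
    """Client: remove the mask, round away the noise, recenter to signed."""
    scores = []
    for a, b in enc_scores:
        noisy = (b - sum(x * s for x, s in zip(a, secret))) % Q
        m = ((noisy + DELTA // 2) // DELTA) % P
        scores.append(m - P if m >= P // 2 else m)
    return scores


# Toy data: three "document embeddings" and one "query embedding".
rng = random.Random(0)  # a real system would need a cryptographic RNG
docs = [[3, -1, 0, 2], [0, 4, 4, -2], [-5, 1, 2, 2]]
query = [1, 2, 2, -1]

secret = keygen(rng)
enc_query = encrypt_query(secret, query, rng)   # sent to the server
enc_scores = server_scores(docs, enc_query)     # computed without the key
scores = decrypt_scores(secret, enc_scores)     # recovered by the client
best = max(range(len(docs)), key=scores.__getitem__)
print(scores, best)
```

Decryption is correct here because the accumulated noise (at most 2 times the sum of absolute document entries in a row) stays well below `DELTA // 2`, so rounding recovers the exact inner product. A real deployment would need secure parameters, a cryptographic random source, and the additional machinery the paper describes for scaling to hundreds of millions of documents.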

Supplementary Material

PDF File (p396-henzinger-supp.pdf)
Supplemental material.

Published In

SOSP '23: Proceedings of the 29th Symposium on Operating Systems Principles
October 2023
802 pages
ISBN: 9798400702297
DOI: 10.1145/3600006
This work is licensed under a Creative Commons Attribution International 4.0 License.

In-Cooperation

  • USENIX

Publisher

Association for Computing Machinery

New York, NY, United States

Acceptance Rates

SOSP '23 Paper Acceptance Rate: 43 of 232 submissions, 19%
Overall Acceptance Rate: 174 of 961 submissions, 18%
