Efficient Vector Store System for Python using Shared Memory

Author: Abhishek Sharma

AIMLSystems '22: Proceedings of the Second International Conference on AI-ML Systems

Article No.: 22, Pages 1 - 6

https://doi.org/10.1145/3564121.3564799

Published: 16 May 2023

Abstract

Many e-commerce companies use machine learning to improve the customer experience. Even within a single company, there are generally many independent services running, each specializing in some aspect of that experience. Since machine learning models operate on abstract vectors representing users and/or items, each such service needs a way to store these vectors. A common approach today is to keep them in in-memory caches such as Memcached. Because these caches run in their own processes, and machine learning (ML) services generally run as Python services, every request that an ML service serves incurs inter-process communication overhead. One can reduce this overhead by storing the vectors directly in a Python dictionary within the service. However, to support concurrency and scale, a single node runs multiple instances of the same service, and a per-process dictionary would hold a full copy of the vectors in every instance. We therefore also want to avoid duplicating these vectors across processes.

In this paper, we propose a system that stores vectors in shared memory and efficiently serves all concurrent instances of the service without replicating the vectors themselves. We achieve up to a 4.5x improvement in latency compared to Memcached. Additionally, because more memory remains available, we can increase the number of server processes running on each node, which translates into greater throughput. We also discuss the impact of the proposed method on throughput in a live production scenario.
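The abstract does not describe the memory layout the system uses, so the following is only a minimal sketch of the general idea, assuming Python's standard `multiprocessing.shared_memory` module (Python 3.8+). The names `create_store` and `read_vector`, the float32 row layout, and the small per-process key-to-row dictionary are all hypothetical choices made for illustration, not the paper's design.

```python
import struct
from multiprocessing import shared_memory

FLOAT_SIZE = 4  # bytes per float32 component (assumed storage type)


def create_store(vectors, dim):
    """Pack every vector into one shared-memory block, row by row.

    Returns the block and a key -> row index. Only the small index would be
    copied into each process; the vector data itself exists once.
    """
    if not vectors:
        raise ValueError("need at least one vector")
    for key, vec in vectors.items():
        if len(vec) != dim:
            raise ValueError(f"vector {key!r} does not have {dim} components")

    shm = shared_memory.SharedMemory(
        create=True, size=len(vectors) * dim * FLOAT_SIZE
    )
    index = {}
    for row, (key, vec) in enumerate(vectors.items()):
        # Write the row as little-endian float32 values at its byte offset.
        struct.pack_into(f"<{dim}f", shm.buf, row * dim * FLOAT_SIZE, *vec)
        index[key] = row
    return shm, index


def read_vector(shm_name, index, key, dim):
    """Attach to an existing block by name and copy out one vector."""
    shm = shared_memory.SharedMemory(name=shm_name)
    try:
        offset = index[key] * dim * FLOAT_SIZE
        return struct.unpack_from(f"<{dim}f", shm.buf, offset)
    finally:
        shm.close()  # detach only; the creating process calls unlink()
```

In this sketch, one process would call `create_store` and pass `shm.name` and `index` to the worker processes, and it would call `shm.close()` and `shm.unlink()` at shutdown. A real service would presumably attach once at startup rather than on every lookup, as `read_vector` does here for brevity.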

Index Terms

Efficient Vector Store System for Python using Shared Memory
1. Computer systems organization
  1. Real-time systems
    1. Real-time system architecture
2. Software and its engineering
  1. Software organization and properties
    1. Contextual software domains
      1. E-commerce infrastructure
    2. Extra-functional properties
      1. Software performance

Published In

AIMLSystems '22: Proceedings of the Second International Conference on AI-ML Systems

October 2022

209 pages

ISBN:9781450398473

DOI:10.1145/3564121

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

AIMLSystems 2022

AIMLSystems 2022: The Second International Conference on AI-ML Systems

October 12 - 15, 2022

Bangalore, India
