DOI: 10.1109/MICRO.2010.16

Virtual Snooping: Filtering Snoops in Virtualized Multi-cores

Published: 04 December 2010

Abstract

Virtualization has been rapidly expanding its applications in numerous server and desktop environments to improve the utilization and manageability of physical systems. This proliferation of virtualized systems opens a new opportunity to improve the scalability of future multi-core architectures. Among the scalability bottlenecks in multi-cores, cache coherence has been a critical problem. Although snoop-based protocols dominate commercial multi-core designs, they have been difficult to scale to more cores, because snooping all the caches requires high network bandwidth and consumes considerable power. In this paper, we propose a novel snoop-based cache coherence protocol, called virtual snooping, for virtualized multi-core architectures. Virtual snooping exploits memory isolation across virtual machines and prevents unnecessary snoop requests from crossing virtual machine boundaries. Each virtual machine becomes a virtual snoop domain, consisting of a subset of the cores in a system. However, in real virtualized systems, virtual machines cannot partition the cores perfectly, because some data is shared across the snoop partitions. This paper investigates three factors that break the memory isolation among virtual machines: data sharing with a hypervisor, virtual machine relocation, and content-based data sharing. We explore the design space of virtual snooping with experiments on real virtualized systems and approximate simulations. The results show that virtual snooping can reduce snoops significantly even if virtual machines migrate frequently. We also propose mechanisms to address content-based data sharing by exploiting its read-only property.
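The snoop-domain idea in the abstract can be illustrated with a toy model. The sketch below is not the paper's protocol: the `SnoopFilter` class, its `core_to_vm` map, and the `shared_pages` set are hypothetical names introduced here, and the broadcast fallback for pages shared outside a single virtual machine is an assumption about how broken isolation might be handled.

```python
from dataclasses import dataclass, field


@dataclass
class SnoopFilter:
    """Toy model (hypothetical): maps each core to the VM running on it."""
    core_to_vm: dict[int, str]
    # Pages visible outside a single VM, e.g. shared with the hypervisor.
    # This bookkeeping is an assumption made for illustration only.
    shared_pages: set[int] = field(default_factory=set)

    def snoop_targets(self, requester: int, page: int) -> set[int]:
        """Return the cores to snoop when `requester` misses on `page`."""
        others = set(self.core_to_vm) - {requester}
        if page in self.shared_pages:
            # Isolation does not hold for this page: snoop every other core.
            return others
        vm = self.core_to_vm[requester]
        # VM-private page: snoop only the cores in the requester's domain.
        return {c for c in others if self.core_to_vm[c] == vm}


f = SnoopFilter(
    core_to_vm={0: "vm0", 1: "vm0", 2: "vm1", 3: "vm1"},
    shared_pages={0x10},
)
print(sorted(f.snoop_targets(0, 0x20)))  # private page    -> [1]
print(sorted(f.snoop_targets(0, 0x10)))  # shared page     -> [1, 2, 3]
```

In this four-core example, a miss on a VM-private page snoops one core instead of three. The model ignores virtual machine relocation and content-based sharing, both of which the abstract identifies as further complications.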

Published In

MICRO '43: Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
December 2010
542 pages
ISBN: 9780769542997

Publisher

IEEE Computer Society

United States

Qualifiers

  • Article

Acceptance Rates

Overall Acceptance Rate 484 of 2,242 submissions, 22%
