DOI: 10.1145/2741948.2741959
research-article
Open access

Hare: a file system for non-cache-coherent multicores

Published: 17 April 2015

Abstract

Hare is a new file system that provides a POSIX-like interface on multicore processors without cache coherence. Hare allows applications on different cores to share files, directories, and file descriptors. The challenge in designing Hare is to support the shared abstractions faithfully enough to run applications that run on traditional shared-memory operating systems, with few modifications, and to do so while scaling with an increasing number of cores.
To achieve this goal, Hare must support features (such as shared file descriptors) that traditional network file systems don't support, as well as implement them in a way that scales (e.g., shard a directory across servers to allow concurrent operations in that directory). Hare achieves this goal through a combination of new protocols (including a 3-phase commit protocol to implement directory operations correctly and scalably) and leveraging properties of non-cache-coherent multiprocessors (e.g., atomic low-latency message delivery and shared DRAM).
An evaluation on a 40-core machine demonstrates that Hare can run many challenging Linux applications (including a mail server and a Linux kernel build) with minimal or no modifications. The results also show these applications achieve good scalability on Hare, and that Hare's techniques are important to achieving scalability.
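
The abstract mentions sharding a directory across servers so that operations in the same directory can proceed concurrently. A minimal sketch of that idea follows, assuming a hash-based placement of directory entries; the abstract does not say how Hare chooses a server, so `entry_server`, `group_by_server`, and the use of SHA-256 are hypothetical choices made only for illustration.

```python
import hashlib


def entry_server(dir_id: int, name: str, num_servers: int) -> int:
    """Pick the server that holds the directory entry (dir_id, name).

    Assumption: placement is a hash of the parent directory id and the
    entry name. This is an illustrative scheme, not Hare's documented one.
    """
    key = f"{dir_id}/{name}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % num_servers


def group_by_server(dir_id: int, names: list[str], num_servers: int) -> dict[int, list[str]]:
    """Group the entries of one directory by the server that stores them."""
    placement: dict[int, list[str]] = {}
    for name in names:
        server = entry_server(dir_id, name, num_servers)
        placement.setdefault(server, []).append(name)
    return placement


if __name__ == "__main__":
    # Entries of a single directory spread over several servers, so creates
    # and lookups of different names need not contend on one server.
    names = [f"msg{i}.eml" for i in range(8)]
    for server, entries in sorted(group_by_server(42, names, 4).items()):
        print(server, entries)
```

Because each name maps to exactly one server, two cores creating different files in the same directory would usually contact different servers. Operations that touch the whole directory, such as listing or removing it, would still need to coordinate across all servers, which is where a multi-phase commit protocol like the one the abstract mentions becomes relevant.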

Supplementary Material

MP4 File (a30-sidebyside.mp4)


Published In

EuroSys '15: Proceedings of the Tenth European Conference on Computer Systems
April 2015
503 pages
ISBN: 9781450332385
DOI: 10.1145/2741948
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

EuroSys '15: Tenth EuroSys Conference 2015
April 21 - 24, 2015
Bordeaux, France

Acceptance Rates

Overall Acceptance Rate 241 of 1,308 submissions, 18%
