research-article
Open access

DARQ Matter Binds Everything: Performant and Composable Cloud Programming via Resilient Steps

Published: 20 June 2023

Abstract

Providing strong fault-tolerance guarantees for the modern cloud is difficult, as application developers must coordinate between independent stateful services and ephemeral compute and handle various failure-induced anomalies. We propose Composable Resilient Steps (CReSt), a new abstraction for resilient cloud applications. CReSt uses fault-tolerant steps as its core building block; a step allows a participant to receive, process, and send messages as a single uninterruptible atomic unit. Composability and reliability are achieved orthogonally by reusable CReSt implementations, for example, ones that leverage reliable message queues. Thus, CReSt application builders focus solely on translating application logic into steps, and infrastructure builders focus on efficient CReSt implementations. We propose one such implementation called DARQ (for Deduplicated Asynchronously Recoverable Queues). At its core, DARQ is a storage service that encapsulates CReSt participant state and enforces CReSt semantics; developers attach ephemeral compute nodes to DARQ instances to implement stateful distributed components. Services built with DARQ are resilient by construction, and CReSt-compatible services naturally compose without loss of resilience. For performance, we propose a novel speculative execution scheme that executes CReSt steps without waiting for message persistence in DARQ, effectively eliding cloud persistence overheads; our scheme maintains CReSt's fault-tolerance guarantees and automatically restores a consistent system state upon failure. We showcase the generality of CReSt and DARQ using two applications: cloud streaming and workflow processing. Experiments show that DARQ achieves extremely low latency and high throughput across these use cases, often beating state-of-the-art customized solutions.
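To make the idea of a step concrete, the following is a minimal in-memory sketch of a participant that consumes a message, updates its state, and emits outputs as one all-or-nothing unit, with duplicate deliveries dropped. It is an illustration only: `ToyParticipant`, `deliver`, `step`, and the `logic` callback are hypothetical names chosen here, not DARQ's or CReSt's actual API, and the sketch models neither persistence nor speculative execution.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set, Tuple

# Application logic (assumed pure): (state, payload) -> (new state, outgoing messages)
StepLogic = Callable[[int, int], Tuple[int, List[int]]]


@dataclass
class ToyParticipant:
    """Hypothetical in-memory stand-in for a step-based participant."""
    state: int = 0
    inbox: List[Tuple[int, int]] = field(default_factory=list)  # (message id, payload)
    outbox: List[int] = field(default_factory=list)
    consumed: Set[int] = field(default_factory=set)

    def deliver(self, msg_id: int, payload: int) -> None:
        # Deduplicate: ignore a message that is already pending or already consumed.
        pending = any(m == msg_id for m, _ in self.inbox)
        if msg_id not in self.consumed and not pending:
            self.inbox.append((msg_id, payload))

    def step(self, logic: StepLogic) -> bool:
        """Consume one message, update state, and emit outputs together."""
        if not self.inbox:
            return False
        msg_id, payload = self.inbox[0]
        # If the logic raises here, nothing below runs, so no partial effect is visible.
        new_state, outgoing = logic(self.state, payload)
        # Commit point: consumption, state change, and sends take effect together.
        self.inbox.pop(0)
        self.consumed.add(msg_id)
        self.state = new_state
        self.outbox.extend(outgoing)
        return True


def running_sum(state: int, payload: int) -> Tuple[int, List[int]]:
    total = state + payload
    return total, [total]


def crashing(state: int, payload: int) -> Tuple[int, List[int]]:
    raise RuntimeError("simulated failure mid-step")
```

In this toy model, a step whose logic fails leaves the message in the inbox and the state untouched, so the step can simply be retried; a real implementation would additionally have to make the commit point durable.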


Published In

Proceedings of the ACM on Management of Data  Volume 1, Issue 2
PACMMOD
June 2023
2310 pages
EISSN:2836-6573
DOI:10.1145/3605748
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. cloud programming
  2. distributed system
  3. recoverability
  4. service composition

Qualifiers

  • Research-article
