DOI: 10.1145/3372885.3373823

REPLica: REPL instrumentation for Coq analysis

Published: 22 January 2020

Abstract

Proof engineering tools make it easier to develop and maintain large systems verified using interactive theorem provers. Developing useful proof engineering tools hinges on understanding the development processes of proof engineers. This paper breaks down one barrier to achieving that understanding: remotely collecting granular data on proof developments as they happen.
We have built a tool called REPLica that instruments Coq’s interaction model in order to collect fine-grained data on proof developments. It is decoupled from the user interface, and designed in a way that generalizes to other interactive theorem provers with similar interaction models.
We have used REPLica to collect data over the span of a month from a group of intermediate through expert proof engineers—enough data to reconstruct hundreds of interactive sessions. The data reveals patterns in fixing proofs and in changing programs and specifications useful for the improvement of proof engineering tools. Our experiences conducting this study suggest design considerations both at the level of the study and at the level of the interactive theorem prover that can facilitate future studies of this kind.

Published In

CPP 2020: Proceedings of the 9th ACM SIGPLAN International Conference on Certified Programs and Proofs
January 2020
381 pages
ISBN: 9781450370974
DOI: 10.1145/3372885
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. proof engineering
  2. study methodologies
  3. user interaction

Qualifiers

  • Research-article

Conference

POPL '20
Acceptance Rates

Overall Acceptance Rate 18 of 26 submissions, 69%
