DOI: 10.1145/3551349.3561150
research-article

Static Type Recommendation for Python

Published: 05 January 2023

Abstract

Python has recently adopted optional type annotations to support type checking and program documentation. To enjoy these benefits, however, developers must write the annotations by hand, which is widely recognized as a time-consuming task. To reduce this manual effort, machine-learning-based approaches have been proposed that recommend types based on code features. These approaches suffer from a correctness problem: the types they recommend can fail type checking. To address this problem, we present Stray, a static type recommendation approach that recommends types correctly, that is, types that pass type checking. We evaluate Stray against four state-of-the-art type recommendation approaches and find that it outperforms these baselines by more than 30% absolute improvement in both precision and recall.
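
As a rough illustration of the correctness problem described above, the sketch below contrasts two candidate annotations for the same function body. The function names and both candidates are hypothetical and are not taken from the paper or from Stray's output. Because Python does not enforce annotations at run time, the incorrect candidate still executes; it is a static checker such as mypy that would be expected to reject it.

```python
from typing import get_type_hints


def count_words_bad(text: int) -> int:
    # Hypothetical incorrect recommendation: `int` has no `split` method,
    # so a static type checker is expected to report an error on this line.
    return len(text.split())


def count_words_good(text: str) -> int:
    # Hypothetical correct recommendation: `str.split` returns a list,
    # and `len` of a list is an `int`, so the annotations are consistent.
    return len(text.split())


# Annotations are not checked at run time, so the badly annotated
# version still runs when it is handed a string.
print(count_words_bad("a b c"))

# The annotations remain available for tools to inspect.
print(get_type_hints(count_words_good))
```

The point of the example is only that a recommended annotation can look plausible, and even run, while being inconsistent with how the value is used in the body.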

Published In

ASE '22: Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering
October 2022
2006 pages
ISBN:9781450394758
DOI:10.1145/3551349
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. python
  2. static analysis
  3. type inference
  4. type recommendation

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ASE '22

Acceptance Rates

Overall Acceptance Rate 82 of 337 submissions, 24%
