Static Type Recommendation for Python

Authors:

Lu Zhang

ASE '22: Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering

Article No.: 98, Pages 1 - 13

https://doi.org/10.1145/3551349.3561150

Published: 05 January 2023

Abstract

Recently, Python has adopted optional type annotations to support type checking and program documentation. However, to enjoy these benefits, developers have to write type annotations manually, which is recognized to be a time-consuming task. To reduce the human effort of manual type annotation, machine-learning-based approaches have been proposed to recommend types based on code features. However, they suffer from a correctness problem, i.e., the recommended types cannot pass type checking. To address the correctness problem of the machine-learning-based approaches, in this paper we present a static type recommendation approach named Stray. Stray can recommend types correctly. We evaluate Stray by comparing it against four state-of-the-art type recommendation approaches, and find that Stray outperforms these baselines by more than 30 percentage points (absolute) in both precision and recall.
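
As a minimal illustration of the correctness problem described in the abstract (not taken from the paper), the sketch below shows an optional type annotation in Python. The function `total_length` and the candidate annotation discussed in the comments are hypothetical. Python does not enforce annotations at run time, so an incorrect recommendation only surfaces when a static checker such as mypy is run on the code.

```python
from typing import List


def total_length(words: List[str]) -> int:
    """Return the combined length of all strings in `words`."""
    return sum(len(w) for w in words)


# Annotations are optional metadata: the interpreter stores them on the
# function object but does not check them when the function is called.
print(total_length.__annotations__)

# A hypothetical recommendation of `words: int` would be incorrect in the
# sense used above: a static checker is expected to reject the function body,
# because an `int` cannot be iterated over, even though the interpreter
# would still accept the definition itself.
print(total_length(["static", "type"]))
```

A recommended type is only useful if the annotated program still passes the type checker, which is the property the abstract refers to as correctness.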

Index Terms

Static Type Recommendation for Python
1. Mathematics of computing
  1. Discrete mathematics
    1. Combinatorics
      1. Combinatorial optimization
2. Software and its engineering
  1. Software notations and tools

Published In

ASE '22: Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering

October 2022

2006 pages

ISBN:9781450394758

DOI:10.1145/3551349

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

ASE '22

ASE '22: 37th IEEE/ACM International Conference on Automated Software Engineering

October 10 - 14, 2022

Rochester, MI, USA

Acceptance Rates

Overall Acceptance Rate 82 of 337 submissions, 24%

Article Metrics

Total Citations: 4
Total Downloads: 243
Downloads (last 12 months): 83
Downloads (last 6 weeks): 6

Reflects downloads up to 07 Nov 2024

Cited By

Wu J, Lemieux C. (2024). QuAC: Quick Attribute-Centric Type Inference for Python. Proceedings of the ACM on Programming Languages 8:OOPSLA2 (2040-2069). DOI: 10.1145/3689783. Online publication date: 8-Oct-2024
https://dl.acm.org/doi/10.1145/3689783
Xu S, Shen J, Li Y, Yao Y, Yu P, Xu F, Ma X. (2024). On the Heterophily of Program Graphs: A Case Study of Graph-based Type Inference. Proceedings of the 15th Asia-Pacific Symposium on Internetware (1-10). DOI: 10.1145/3671016.3671389. Online publication date: 24-Jul-2024
https://dl.acm.org/doi/10.1145/3671016.3671389
Guo Y, Chen Z, Chen L, Xu W, Li Y, Zhou Y, Xu B. (2024). Generating Python Type Annotations from Type Inference: How Far Are We? ACM Transactions on Software Engineering and Methodology 33:5 (1-38). DOI: 10.1145/3652153. Online publication date: 3-Jun-2024
https://dl.acm.org/doi/10.1145/3652153
Del Carpio A, Angarita L. (2023). Assistant Solutions in Software Engineering: A Systematic Literature Review. 2023 IEEE 14th International Conference on Software Engineering and Service Science (ICSESS) (93-100). DOI: 10.1109/ICSESS58500.2023.10293029. Online publication date: 17-Oct-2023
https://doi.org/10.1109/ICSESS58500.2023.10293029
