An Enumerative Embedding of the Python Type System in ACL2s
By: Samuel Xifaras, Panagiotis Manolios, Andrew T. Walter and more
Potential Business Impact:
Finds hidden bugs in computer programs.
Python is a high-level interpreted language that has become an industry standard in a wide variety of applications. In this paper, we take a first step towards using ACL2s to reason about Python code by developing an embedding of a subset of the Python type system in ACL2s. The subset of Python types we support includes many of the most commonly used type annotations, as well as user-defined types composed of supported types. We provide ACL2s definitions of these types, together with defdata enumerators customized to provide code coverage and identify errors in Python programs. Using the ACL2s embedding, we generate instances of types that serve as inputs for fuzzing Python programs, which allows us to identify bugs in Python code that state-of-the-art Python type checkers do not detect. We evaluate our work against four open-source repositories, extracting their type information and generating inputs for fuzzing functions whose type signatures fall within the supported subset of Python types. Note that we use only the type signatures of functions to generate inputs and treat the function bodies as black boxes. We measure code coverage, which ranges from about 68% to more than 80%, and identify code patterns that hinder coverage, such as complex branch conditions and external file system dependencies. We conclude with a discussion of the results and recommendations for future work.
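To make the idea of signature-driven fuzzing concrete, here is a minimal sketch written purely in Python. It is not the paper's ACL2s embedding and does not use defdata enumerators: it swaps in a simple random generator over a handful of annotations. The helper names `generate` and `fuzz`, the target function `mean`, and the choice of sample values are all hypothetical and exist only for illustration. As in the abstract, the sketch reads only the type signature and treats the function body as a black box.

```python
import random
import typing


def generate(tp, rng, depth=0):
    """Return one random value for annotation `tp` (hypothetical helper).

    Only a few annotations are handled; this stands in for, and is much
    cruder than, the enumerators described in the abstract.
    """
    origin = typing.get_origin(tp)
    args = typing.get_args(tp)
    if tp is type(None):
        return None
    if tp is bool:
        return rng.random() < 0.5
    if tp is int:
        # Mix boundary-style values with an arbitrary integer.
        return rng.choice([0, 1, -1, rng.randint(-10**6, 10**6)])
    if tp is float:
        return rng.choice([0.0, -1.5, rng.uniform(-1e3, 1e3)])
    if tp is str:
        return "".join(rng.choice("ab 0") for _ in range(rng.randint(0, 4)))
    if origin is list:
        # Cap nesting so recursive list types stay small.
        size = 0 if depth > 2 else rng.randint(0, 3)
        return [generate(args[0], rng, depth + 1) for _ in range(size)]
    if origin is typing.Union:  # covers typing.Optional[T] as well
        return generate(rng.choice(args), rng, depth + 1)
    raise NotImplementedError(f"unsupported annotation: {tp!r}")


def fuzz(func, trials=200, seed=0):
    """Call `func` with generated arguments and collect raised exceptions."""
    rng = random.Random(seed)
    hints = typing.get_type_hints(func)
    hints.pop("return", None)  # only parameter annotations drive generation
    failures = []
    for _ in range(trials):
        kwargs = {name: generate(tp, rng) for name, tp in hints.items()}
        try:
            func(**kwargs)
        except Exception as exc:  # the body is a black box; record any crash
            failures.append((kwargs, exc))
    return failures


def mean(xs: typing.List[int]) -> float:
    """Hypothetical target: well typed, but crashes on an empty list."""
    return sum(xs) / len(xs)


failures = fuzz(mean)
```

A static type checker accepts `mean` because an empty list is a valid `List[int]`, whereas the fuzzing loop records a failure whenever the generator produces `[]`. Deliberately including small and boundary values in the generator is what makes such inputs likely to appear.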
Similar Papers
Navigating the Python Type Jungle
Programming Languages
Makes Python code easier to understand and fix.
Automated Type Annotation in Python Using Large Language Models
Programming Languages
Helps computers understand code better automatically.
FLAT: Formal Languages as Types
Software Engineering
Keeps computer programs from making mistakes with text.