Score: 2

Gradual Metaprogramming

Published: June 10, 2025 | arXiv ID: 2506.09043v3

By: Tianyu Chen, Darshal Shetty, Jeremy G. Siek and more

BigTech Affiliations: Meta

Potential Business Impact:

Finds coding mistakes earlier in data programs.

Business Areas:
Natural Language Processing Artificial Intelligence, Data and Analytics, Software

Data engineers increasingly use domain-specific languages (DSLs) to generate the code for data pipelines. Such DSLs are often embedded in Python. Unfortunately, there are challenges in debugging the generation of data pipelines: an error in a Python DSL script is often detected too late, after the execution of the script, and the source code location that triggers the error is hard to pinpoint. In this paper, we focus on the scenario where a DSL embedded in Python (so it is dynamically-typed) generates data pipeline description code that is statically-typed. We propose gradual metaprogramming to (1) provide a migration path toward statically typed DSLs, (2) immediately provide earlier detection of code generation type errors, and (3) report the source code location responsible for the type error. Gradual metaprogramming accomplishes this by type checking code fragments and incrementally performing runtime checks as they are spliced together. We define MetaGTLC, a metaprogramming calculus in which a gradually-typed metalanguage manipulates a statically-typed object language, and give semantics to it by translation to the cast calculus MetaCC. We prove that successful metaevaluation always generates a well-typed object program and mechanize the proof in Agda.
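As a rough illustration of the idea in the abstract (checking fragments at the moment they are spliced and reporting the responsible source location), here is a minimal Python sketch. It is not the paper's MetaGTLC or MetaCC; the names `Code`, `lit`, `add`, and `SpliceTypeError` are hypothetical, and the object language is assumed to have only `int` and `str` literals and integer addition.

```python
import inspect
from dataclasses import dataclass


class SpliceTypeError(TypeError):
    """Raised when an ill-typed fragment is spliced (hypothetical name)."""


@dataclass(frozen=True)
class Code:
    """An object-language fragment tagged with its static type."""
    text: str
    type: str  # "int" or "str" in this toy object language


def lit(value):
    """Quote a Python int or str as a typed object-language literal."""
    if isinstance(value, bool) or not isinstance(value, (int, str)):
        raise SpliceTypeError(f"unsupported literal: {value!r}")
    return Code(repr(value), type(value).__name__)


def add(left, right):
    """Splice two fragments into an addition, checking types right away."""
    # Record the DSL script location that performed the splice, so the
    # error points at the generator code rather than the generated code.
    caller = inspect.stack()[1]
    where = f"{caller.filename}:{caller.lineno}"
    for name, frag in (("left", left), ("right", right)):
        if not isinstance(frag, Code):
            raise SpliceTypeError(f"{where}: {name} operand is not a code fragment")
        if frag.type != "int":
            raise SpliceTypeError(
                f"{where}: {name} operand has type {frag.type}, expected int"
            )
    return Code(f"({left.text} + {right.text})", "int")


if __name__ == "__main__":
    print(add(lit(1), lit(2)).text)
    try:
        add(lit(1), lit("x"))  # rejected at splice time, before any pipeline runs
    except SpliceTypeError as err:
        print(err)
```

In this sketch the ill-typed splice fails while the generator script is still running, and the message carries the file and line of the call to `add`, which loosely mirrors goals (2) and (3) of the abstract.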

Country of Origin
🇺🇸 United States

Repos / Data Links

Page Count
14 pages

Category
Computer Science:
Programming Languages