Score: 0

How Deep Does Your Dependency Tree Go? An Empirical Study of Dependency Amplification Across 10 Package Ecosystems

Published: December 12, 2025 | arXiv ID: 2512.14739v1

By: Jahidul Arafat

Potential Business Impact:

Makes software safer by checking hidden code.

Business Areas:
Application Performance Management Data and Analytics, Software

Modern software development relies on package ecosystems where a single declared dependency can pull in many additional transitive packages. This dependency amplification, defined as the ratio of transitive to direct dependencies, has major implications for software supply chain security, yet amplification patterns across ecosystems have not been compared at scale. We present an empirical study of 500 projects across ten major ecosystems, including Maven Central for Java, npm Registry for JavaScript, crates io for Rust, PyPI for Python, NuGet Gallery for dot NET, RubyGems for Ruby, Go Modules for Go, Packagist for PHP, CocoaPods for Swift and Objective C, and Pub for Dart. Our analysis shows that Maven exhibits mean amplification of 24.70 times, compared to 4.48 times for Go Modules, 4.32 times for npm, and 0.32 times for CocoaPods. We find significant differences with large effect sizes in 22 of 45 pairwise comparisons, challenging the assumption that npm has the highest amplification due to its many small purpose packages. We observe that 28 percent of Maven projects exceed 10 times amplification, indicating a systematic pattern rather than isolated outliers, compared to 14 percent for RubyGems, 12 percent for npm, and zero percent for Cargo, PyPI, Packagist, CocoaPods, and Pub. We attribute these differences to ecosystem design choices such as dependency resolution behavior, standard library completeness, and platform constraints. Our findings suggest adopting ecosystem specific security strategies, including systematic auditing for Maven environments, targeted outlier detection for npm and RubyGems, and continuation of current practices for ecosystems with controlled amplification. We provide a full replication package with data and analysis scripts.

Country of Origin
🇺🇸 United States

Page Count
14 pages

Category
Computer Science:
Software Engineering