Application-Specific Component-Aware Structured Pruning of Deep Neural Networks via Soft Coefficient Optimization
By: Ganesh Sundaram, Jonas Ulmen, Amjad Haider, et al.
Potential Business Impact:
Makes smart computer programs smaller while still working well.
Deep neural networks (DNNs) offer significant versatility and performance benefits, but their widespread adoption is often hindered by high model complexity and computational demands. Model compression techniques such as pruning have emerged as promising solutions to these challenges. However, it remains critical to ensure that application-specific performance characteristics are preserved during compression. In structured pruning, where groups of structurally coherent elements are removed, conventional importance metrics frequently fail to maintain these essential performance attributes. In this work, we propose an enhanced importance metric framework that not only reduces model size but also explicitly accounts for application-specific performance constraints. We employ multiple strategies to determine the optimal pruning magnitude for each group, ensuring a balance between compression and task performance. Our approach is evaluated on an autoencoder tasked with reconstructing MNIST images. Experimental results demonstrate that the proposed method effectively preserves task-relevant performance, maintaining the model's usability even after substantial pruning, by satisfying the required application-specific criteria.
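The idea described in the abstract can be illustrated with a minimal sketch: score each structured group (here, the output channels of one layer) by an importance metric, derive soft coefficients from those scores, and prune low-coefficient groups only while an application-specific performance budget still holds. The specific importance metric (L2 norm), the normalized form of the coefficients, the greedy pruning loop, and the 30% error budget below are all illustrative assumptions, not the paper's actual optimization procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dense layer standing in for an autoencoder's encoder:
# 8 output channels (the prunable groups), 16 inputs.
W = rng.normal(size=(8, 16))
W[5:] *= 0.05                      # make three groups nearly redundant
x = rng.normal(size=(16, 32))      # batch of 32 inputs

def relative_error(mask):
    """Deviation of the masked output from the dense output: a
    stand-in for the application-specific metric (e.g. MNIST
    reconstruction error in the paper's experiments)."""
    dense = W @ x
    pruned = (mask[:, None] * W) @ x
    return np.linalg.norm(dense - pruned) / np.linalg.norm(dense)

# Group importance: L2 norm of each output channel (row of W).
importance = np.linalg.norm(W, axis=1)

# Soft coefficients in [0, 1] (an assumed form; the paper optimizes
# such coefficients jointly against the task constraint).
alpha = importance / importance.max()

# Greedily zero out the lowest-coefficient groups, keeping the
# application-specific error within an assumed 30% budget.
budget = 0.30
mask = np.ones(len(alpha))
for g in np.argsort(alpha):
    trial = mask.copy()
    trial[g] = 0.0
    if relative_error(trial) <= budget:
        mask = trial

print(int(mask.sum()), "groups kept; relative error",
      round(relative_error(mask), 3))
```

Because three channels were scaled to near zero, the loop removes them (and possibly more) without exceeding the budget, which is the balance between compression and task performance the abstract describes.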
Similar Papers
Enhanced Pruning Strategy for Multi-Component Neural Architectures Using Component-Aware Graph Analysis
Machine Learning (CS)
Makes big computer brains smaller without losing smarts.
Lightweight and Post-Training Structured Pruning for On-Device Large Language Models
Machine Learning (CS)
Makes big AI models work on small phones.
Integrating Pruning with Quantization for Efficient Deep Neural Networks Compression
Neural and Evolutionary Computing
Makes smart computer programs smaller and faster.