
Received April 13, 2022, accepted June 5, 2022, date of publication June 13, 2022, date of current version June 20, 2022.


Digital Object Identifier 10.1109/ACCESS.2022.3182659

Methods for Pruning Deep Neural Networks


SUNIL VADERA AND SALEM AMEEN
School of Science, Engineering and Environment, University of Salford, Salford M5 4WT, U.K.
Corresponding author: Sunil Vadera ([email protected])

ABSTRACT This paper presents a survey of methods for pruning deep neural networks. It begins by
categorising over 150 studies based on the underlying approach used and then focuses on three categories:
methods that use magnitude based pruning, methods that utilise clustering to identify redundancy, and
methods that use sensitivity analysis to assess the effect of pruning. Some of the key influencing studies
within these categories are presented to highlight the underlying approaches and results achieved. Most
studies present results which are distributed in the literature as new architectures, algorithms and data sets
have developed with time, making comparison across different studies difficult. The paper therefore provides
a resource for the community that can be used to quickly compare the results from many different methods
on a variety of data sets, and a range of architectures, including AlexNet, ResNet, DenseNet and VGG. The
resource is illustrated by comparing the results published for pruning AlexNet and ResNet50 on ImageNet
and ResNet56 and VGG16 on the CIFAR10 data to reveal which pruning methods work well in terms
of retaining accuracy whilst achieving good compression rates. The paper concludes by identifying some
research gaps and promising directions for future research.

INDEX TERMS Deep learning, neural networks, pruning deep networks.

I. INTRODUCTION

Deep learning and its use in high profile applications such as autonomous vehicles [1], predicting breast cancer [2], speech recognition [3] and natural language processing [4] have propelled interest in Artificial Intelligence to new heights, with most countries making it central to their industrial and commercial strategies for innovation.

Although there are different types of architectures [5], deep networks typically consist of layers of neurons that are connected to neurons in preceding layers via weighted links. Another characteristic, which is considered central to their predictive power [6], is that they have a large number of parameters that need to be learned, with networks such as ResNet50 [7] having more than 25 million parameters and VGG16 [8] having more than 138 million weights. An obvious question, therefore, is to ask whether it is possible to develop smaller, more efficient networks without compromising accuracy. One direction of work aimed at addressing this question has been to first train a large network and then to prune and fine-tune it. Although methods for pruning shallow neural networks were proposed in the 1980s and 90s [9]–[11], recent advances in deep learning and its potential for applications in embedded systems have led to an increasing number and variety of algorithms for pruning deep neural networks. Hence, this paper presents a survey of recent work on pruning neural networks that can be used to understand the types of algorithms developed, appreciate the key ideas underpinning the algorithms and gain familiarity with the major approaches and issues in the field. The paper aims to achieve this goal by presenting the progressive path from the earlier algorithms to the recent work, categorising algorithms based on the approach used, contrasting the similarities and differences between the algorithms and concluding with some directions for future research.

The studies on pruning methods all carry out empirical evaluations that compare the performance of algorithms on different architectures and benchmark data sets. These evaluations have evolved as new deep learning architectures have developed, as new data sets have become available and as new pruning algorithms have been proposed. This paper also provides a useful resource that brings together the reported results in one place, allowing researchers to quickly compare the reported results on different architectures and data sets.

The survey identified over 150 studies on pruning neural networks, which can be categorised into the following eight groups based on the underlying approach used:

1) Magnitude based pruning methods [12]–[15], which are based on the view that the saliency of weights and neurons can be determined by local measures such as their magnitude.
2) Similarity and clustering methods [16]–[21], which aim to identify duplicate or similar weights which are redundant and can be pruned without impacting accuracy.
3) Sensitivity analysis methods [9], [22]–[27], that assess the effect of removing or perturbing weights on the loss and then remove a proportion of the weights that have least impact on accuracy.
4) Knowledge distillation methods [28]–[31], which utilise the original model, termed the Teacher, to learn a more compact new model called the Student.
5) Low rank methods [27], [32], [33], that factor a weight matrix into a product of two smaller matrices which can then be used to perform an equivalent function more efficiently than the single larger weight matrix.
6) Quantization methods [34]–[39], which are based on using quantization, hashing, low precision and binary representations of the weights as a way of reducing the computations.
7) Architectural design methods [40]–[46], that utilise intelligent search and reinforcement learning methods to generate neural network architectures.
8) Hybrid methods [47]–[49], which utilise a combination of methods aimed at taking advantage of the cumulative compressing effects of the different types of methods.

Table 1 classifies over 150 studies identified by the survey into the 8 categories, enabling researchers working on a particular type of method to locate related studies. Given the range of studies, and availability of surveys already covering some of the above categories, this paper focuses on recent algorithms in the first three categories for pruning. Reed [11] provides an excellent survey of pruning methods prior to the deep learning era. Readers interested in the use of quantization, low rank and knowledge distillation methods are referred to the survey by Lebedev et al. [50] and readers interested in architectural design methods are referred to the comprehensive survey by Elsken et al. [51]. Pruning networks is just one step in developing efficient models and a recent survey by Menghan [52] summarises the full range of methods, from use of quantization and learning, to the available software and hardware infrastructure for efficient deployment of models. Another important direction of work, worthy of a survey in its own right, and not in the scope of this paper, is the use of variational Bayesian methods for regularization [53]–[59].

TABLE 1. Categorisation of studies on pruning.

FIGURE 1. A selection of pruning methods grouped in terms of the approach adopted.

Fig. 1 shows a selection of the methods covered in greater detail in this survey and includes a sub-categorization of magnitude and sensitivity analysis methods. The survey found relatively few methods that utilise similarity and clustering, and further sub-categorization is not useful. Magnitude based methods can be sub-categorised into: (i) data dependent methods that utilise a sample of examples to assess the extent to which removing weights impacts the outputs from the next layer; (ii) data independent methods, that utilise measures such as the magnitude of a weight; and (iii) the use of optimisation methods to reduce the number of weights in a layer whilst approximating the function of the layer. Methods that utilise sensitivity analysis can be sub-categorized into those that: (i) adopt a Taylor series approximation of the loss and (ii) use sampling to estimate the change in loss when weights are removed.

The rest of this paper is organised as follows. Section II presents the background. Sections III to V describe representative methods in the three categories: magnitude based pruning, clustering and similarity, and sensitivity analysis. Section III also includes coverage of the Lottery Hypothesis, an issue about the existence of smaller networks and fine-tuning, that cuts across the different methods. Section VI presents a comparison of the published results for pruning AlexNet, ResNet and VGG to illustrate the resource provided for comparing the methods. Section VII concludes by highlighting some key insights and suggesting directions for future research.

II. BACKGROUND
This section introduces the background knowledge assumed in the survey.1 Fig. 2 shows the structure of one of the earliest convolutional neural networks (CNNs), LeNet-5 [60], which recognises handwritten digits by applying convolutions and pooling operations to identify features. These features then provide the input to fully connected layers that classify the images. The pooling operation takes feature maps as input and reduces their size by applying an operation, such as the maximum value within a neighbourhood, while the convolution operation applies filters (or kernels) to the input channels (or feature maps) to produce the output feature maps. The filters are k × k matrices that slide over the input feature maps and convolve with the corresponding elements of the input feature maps to produce the output feature maps. The elements of a filter correspond to the weights (or parameters) that are used to transform regions in feature maps in one layer to the next and need to be learned through training. The weights (or parameters), either individually or collectively as filters, are therefore the primary candidates for pruning.

1 Readers unfamiliar with deep neural networks are referred to tutorial accounts such as by Goodfellow et al. [54] for further details.

FIGURE 2. The LeNet-5 network and how it processes an input image via convolutions (Conv.) and pooling operations to produce feature maps (FMs) and uses fully connected (FC) layers to perform classification.

The LeNet-5 model, with 60K parameters in 5 layers, achieved impressive results on a data set known as MNIST [61].2 In a breakthrough in 2012, AlexNet built upon the concepts in LeNet-5 and developed a deeper network with over 60M parameters in 8 layers to win a competition known as ImageNet by a significant margin [62]. This success was followed by the development of architectures like VGG, ResNet, and ResNeXT that used an increasing number of layers and parameters to gain further improvements in the ImageNet competition [5]. The huge number of parameters in these models does necessitate greater computational resources and inhibits their use in embedded systems, which has motivated the research on pruning that is surveyed in this paper.

2 Modified National Institute of Standards and Technology.

The pruning methods developed are evaluated on a range of architectures (e.g., ResNet, VGG, DenseNet) and data sets (e.g., ImageNet, CIFAR, SVHN). Khan et al. [63] present a tutorial on deep learning architectures and Appendix A summarises the data sets. When evaluating pruning methods, the surveyed papers use the following measures to report their results:
• The Top-1 and Top-5 accuracy, which report the proportion of times the correct classification appears first or in the top 5 list of ranked results. In the sections below, unless we explicitly qualify a measure, the Top-1 accuracy should be assumed.
• The compression rate, which is the ratio of parameters before and after a model is pruned.
• The computational efficiency in terms of the FLOPS (Floating Point Operations) required to perform a classification.

The notation used in the paper is defined where it is used and also summarised in Appendix B. With this background in place, Sections III to V describe and contrast key influential studies that bring out the features of the categories of methods surveyed in this paper.

III. MAGNITUDE BASED PRUNING
This section presents pruning methods that remove weights, nodes, and filters based on a measure of magnitude or the effect filters have on the next layer. Section III-A summarizes an early influential method for pruning weights and Section III-B presents a recent hot topic, termed the Lottery Hypothesis, that reinvigorates research on the existence of smaller networks and raises issues about fine-tuning a pruned network. Section III-C describes the key ideas behind methods that prune filters and feature maps.

A. NETWORK PRUNING OF WEIGHTS
One of the first studies to utilise magnitude based pruning for deep networks is due to Han et al. (2015) who adopt a process in which weights below a given threshold are pruned [64].3 Once pruned, the network is fine-tuned and the process repeated until its accuracy begins to deteriorate.

3 We use a number citation style, but include the name(date) format to highlight the date of the publication where we think it is relevant. The corresponding reference number is provided at the end of the sentence.

Han et al. [64] carry out several experiments to compare the merits of their magnitude based iterative pruning method. First, they apply their method on a fully connected network known as LeNet-300-100 and then on LeNet-5 (Fig. 2), both of which are trained on the MNIST data. Their results show that it is possible to reduce the number of weights by a factor of 12 without compromising accuracy. Second, they apply iterative pruning to AlexNet and VGG16 trained on the ImageNet data, and show that it is possible to reduce the number of weights by a factor of 9 and 12 respectively. Thirdly, they compare the merits of using regularisation to drive down the magnitude of weights to aid subsequent pruning. They explore regularisation with both the L1 and L2 norms and conclude that L1 is better immediately after pruning (without fine-tuning), but L2 is better if the weights of the pruned model are fine-tuned. Their experiments also suggest that the earlier layers (i.e., closer to the inputs) are the most sensitive to pruning and that iterative pruning is better than pruning the required proportion of weights in one cycle (i.e., one-shot pruning).

The study by Han et al. [64] is notable in that (i) it demonstrated that it was possible to reduce the size of deep networks significantly without compromising accuracy, (ii) it highlighted the benefits of iterative pruning and (iii) it prompted further research on questions such as whether retraining from scratch or fine-tuning is better following pruning.

Guo et al. [65] note that magnitude pruning can lead to premature removal of weights that can become important given removal of other weights. To address this, they propose a method known as Dynamic Network Surgery (Dyn Surg) which maintains a mask that indicates which weights should be removed and retained in each training cycle, thereby allowing reinstatement of weights previously marked to be pruned if they turn out to be important. They compare their method with magnitude pruning, with the results showing that it reduces the number of weights by a factor of over 17 for AlexNet on ImageNet.
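To make the prune-and-fine-tune cycle concrete, the sketch below shows one way iterative magnitude pruning could be implemented in PyTorch. The per-layer quantile threshold, the number of rounds and the user-supplied fine_tune routine are illustrative assumptions rather than details taken from Han et al. [64].

```python
import torch
import torch.nn as nn

def magnitude_prune_(model: nn.Module, fraction: float) -> dict:
    """Zero the smallest-magnitude weights in each prunable layer and return the masks."""
    masks = {}
    with torch.no_grad():
        for name, module in model.named_modules():
            if isinstance(module, (nn.Linear, nn.Conv2d)):
                w = module.weight
                threshold = torch.quantile(w.abs().flatten(), fraction)
                mask = (w.abs() > threshold).float()
                w.mul_(mask)                    # prune in place
                masks[name] = mask
    return masks

def iterative_prune(model, fine_tune, fraction=0.2, rounds=5):
    """Alternate pruning and fine-tuning; fine_tune is assumed to re-apply the masks
    after each optimiser step so that pruned weights stay at zero."""
    for _ in range(rounds):
        masks = magnitude_prune_(model, fraction)
        fine_tune(model, masks)                 # user-supplied training loop
    return model
```

In practice the loop would be stopped once validation accuracy starts to deteriorate, mirroring the stopping criterion described above.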
B. THE LOTTERY TICKET HYPOTHESIS
One of the most interesting observations in Han et al. [64] is that re-initialization of the weights does not lead to accurate models and, based on their trials, it was better to fine-tune the weights of the pruned model. Following on from this observation, Frankle and Carbin (2019) propose the Lottery Ticket Hypothesis which states that: a trained network contains a subnetwork, which can be trained to be at least as accurate as the original network using no more than the number of epochs used for training the original network [66]. This subnetwork is termed a winning lottery ticket, given that it was lucky to be initialised with suitable weights.

To test this hypothesis, they propose two pruning methods. First, in a one-shot method, they use magnitude pruning to prune p% of the weights, reset the remaining weights to their initial values and retrain. Second, they utilise an iterative pruning method with n cycles, with each cycle pruning p^{1/n} of the weights.
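A hedged sketch of how this iterative procedure could be reproduced in PyTorch is shown below: the initial weights are saved, and after each round of training the smallest surviving weights are masked out and the survivors are rewound to their initial values. The fixed per-round pruning fraction and the train routine are assumptions made for illustration.

```python
import copy
import torch

def iterative_lottery_ticket(model, train, rounds=5, prune_per_round=0.2):
    """Train, prune the smallest surviving weights by magnitude, rewind the
    survivors to their original initialisation, and repeat."""
    initial = copy.deepcopy(model.state_dict())       # candidate winning-ticket initialisation
    masks = {n: torch.ones_like(p) for n, p in model.named_parameters() if p.dim() > 1}
    for _ in range(rounds):
        train(model, masks)                           # user-supplied training loop
        with torch.no_grad():
            for name, p in model.named_parameters():
                if name not in masks:
                    continue
                alive = p[masks[name].bool()].abs()
                cut = torch.quantile(alive, prune_per_round)
                masks[name] *= (p.abs() > cut).float()
                # rewind: survivors return to their initial values, pruned weights to zero
                p.copy_(initial[name] * masks[name])
    return model, masks
```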
They perform experiments on the fully connected LeNet-300-100 network for the MNIST data, and variants of VGG and ResNet for the CIFAR10 data. Their experiments on the LeNet-300-100 network prune a percent of the weights from each layer except the final layer, in which the percent pruned is reduced by half. Their results with iterative pruning show that: (i) a subnetwork that is only 3.6% of its original size performs just as well, (ii) random initialization of the pruned networks results in slower learning in comparison to use of the original weight initializations, (iii) the subnetworks (termed winning tickets) found learn faster than the original network, (iv) there is continual improvement in the rate of learning as the size of the network reduces, but only up to a point, after which learning slows down and begins to regress to the performance of the original network, and (v) iterative pruning tends to result in more accurate smaller networks than one-shot pruning.

Their experiments on the larger networks, VGG and ResNet, show that identification of winning lotteries depends on the learning rate, with a lower rate successfully identifying winning lottery subnets, and that pruning weights over all the network, as opposed to layer by layer, produces better results.

These results provide good empirical evidence for the Lottery Hypothesis, and the award of a best paper prize at the 2019 International Conference on Learning Representations is indicative of the significance of the paper and the attention it has attracted.

In their paper, "Rethinking the value of network pruning", Liu et al. (2019) challenge the claim that it is better to utilise the initial weights of a pruned model when compared with random initialization [67]. To test this, they carry out experiments on VGG, ResNet, and DenseNet using the CIFAR10, CIFAR100, and ImageNet data.

They define three types of pruning regime: structured pruning, where the proportion of channels that are pruned per layer is predefined; automatic pruning, where the proportion of channels pruned overall is predefined but the per layer rate is determined by the algorithm; and unstructured weight pruning, where only the proportion of weights pruned is predefined. Their results suggest that for structured and automatic pruning, random initialization is equally (if not more) effective. However, for unstructured networks, random initialization can achieve similar results on small data sets but for large scale data such as ImageNet, fine-tuning produces better results.

At first sight, their findings contradict the Lottery Hypothesis. However, in a follow up study, Frankle et al. (2019) acknowledge that setting the weights of pruned networks to their initial values does not work well on larger networks and suggest that methods for retraining from random initializations do not work well either, except for moderate levels of pruning (up to 30%) [68]. They therefore propose setting the weights to those obtained in a later iteration of training, which they then demonstrate to be beneficial in identifying good initialization of weights for larger scale problems such as ImageNet.

The above studies focus on empirical evaluations of networks trained and used on the same data sets, and primarily on image processing classification tasks. Morcos et al. (2019) explore a number of other interesting questions [69]:
• Are the lotteries found for one image classification task transferable to other tasks?
• Are lotteries observable in other tasks (such as natural language processing), and architectures?
• Are they transferable across different optimizers?

To explore these questions, they carry out experiments with VGG19 and ResNet50 using six data sets (Fashion-MNIST, SVHN, CIFAR10, CIFAR100, ImageNet, Places365), in which the lotteries (i.e., subnetworks with initializations) identified for one task are used for another task. Their experiments use iterative magnitude based pruning, selecting 20% of the weights over all the layers, and with late setting of weights (as proposed in Frankle et al. [66]). The results are interesting: in general, winning initializations carry across similar image processing tasks and winning tickets from larger scale tasks were more transferable than the tickets from the smaller scale tasks. In some cases, for example, the use of VGG19 on the Fashion-MNIST data, the winning tickets obtained from the use of VGG19 on the larger data sets (CIFAR100, ImageNet) performed better than those obtained directly from the Fashion-MNIST data.

Hubens et al. (2020) carry out empirical trials that confirm similar results on the size of the pruned networks [70]. They show that when a network is trained on a larger data set, such as ImageNet, and transferred and fine-tuned for a different task, pruning can result in a smaller network than if it was trained from scratch on the new task.

Morcos et al. (2019) carry out experiments in which lottery tickets are identified using one optimizer, ADAM (adaptive moment estimation), and then utilise a different optimizer, SGD (Stochastic Gradient Descent) with momentum, and vice versa on the CIFAR10 data. Their results suggest that, in general, winning tickets are optimizer independent [69].

To test if the lottery hypothesis holds in other types of problems, Yu et al. (2019) carry out experiments on natural language processing (NLP) and control tasks in games [71]. For NLP, they utilise LSTMs for the Wikitext-2 data [72] and Transformer models for translating news in English to German [73]. The experiments were carried out with 20 rounds of iterative pruning and with one-shot pruning. A pruning rate of 20% was used and, following pruning, weights were reset to those learned during a later round of training. For control tasks, they utilise Reinforcement Learning (RL) and carry out experiments on fully connected networks used for 3 OpenAI Gym environments [74] and 9 Atari games that utilise convolutional networks [75].

From their results on NLP and the RL control tasks, they conclude that both iterative pruning and late setting of weights are superior in comparison to random initialization of pruned networks, with iterative pruning being essential when a significant number of weights (i.e., more than two-thirds) are pruned. For the Atari games, the results varied: in one case, it led to improvements over the original network (Berzerk game) while in another, an initial improvement was followed by a significant drop in accuracy as the amount of pruning increased (Space Invaders game). In other cases, pruning resulted in a reduction in performance (e.g., Assault game). Thus in summary, Yu et al. [71] provide some evidence that the lottery hypothesis holds for NLP tasks and for some control tasks that utilise RL.

C. PRUNING FEATURE MAPS AND FILTERS
Although the kind of methods described in Section III-A result in fewer weights, they require specialist libraries or hardware for processing the resulting sparse weight matrices [76]–[78]. In contrast, pruning at higher levels of granularity, such as pruning filters and channels, benefits from the optimizations already available in many current toolkits. This has led to a number of methods for pruning feature maps and filters which are summarized in this section.

FIGURE 3. Illustration of how the feature maps are computed, where W_{j,i} are the k × k filters used on the input channels X_i to obtain output feature maps Y_j.

To appreciate the intuition and notation behind these methods, it is worth bearing in mind how filters are applied to the input channels to produce the output feature maps. Fig. 3 illustrates the process, showing how an image with 3 channels is taken as input and convolved with the filters to produce the 4 output feature maps. Given the visualisation offered by Fig. 3, how can one best prune the filters and channels? The survey revealed three main directions of research:
• Data dependent channel pruning methods, which are based on the view that when different inputs are presented, the output channels (i.e., feature maps) should vary given they are meant to detect discriminative features.
• Data independent pruning methods, that use properties of the filters and output channels, such as the proportion of zeros present, to decide which filters and channels should be pruned.
• Optimization based channel approximation pruning methods, that use optimization methods to recreate the filters to approximate the output feature maps.

The following describes and contrasts methods that typify these three directions.

1) PRUNING BASED ON VARIANCE OF CHANNELS AND FILTERS
Polyak & Wolf (2015) propose two methods for pruning channels: Inbound pruning, which aims to reduce the number of channels incoming to a filter, and Reduce and Reuse pruning, which aims to reduce the number of output channels [79]. The idea behind Inbound pruning is to assess the extent to which an input channel's contribution to producing an output feature map varies with different examples. This assessment is done by applying the network to a sample of the images and then using the variance in a feature map as a measure of its contribution.

More formally, given W_{j,i}, the jth filter for the ith input channel, and X_i^p, the input from the ith channel for the pth example, the contribution to the jth output feature map, Y_{j,i}^p, is defined by:

$Y_{j,i}^p = \| W_{j,i} \cdot X_i^p \|_F$   (1)

Given this definition, the measure used to assess the variation in its contribution, σ_{j,i}^2, from the N samples is:

$\sigma_{j,i}^2 = \mathrm{var}\{ Y_{j,i}^p \mid p = 1 \ldots N \}$   (2)

Inbound pruning uses this measure to rank the filters W_{j,i} and removes any that fall below a specified threshold.

The Reduce and Reuse pruning method focuses on assessing the variations in the output feature maps when different samples are presented. That is, the method first computes the variations in the output feature maps σ_j^2 using:

$\sigma_j^2 = \mathrm{var}\{ \| \sum_{i=1}^{m} Y_{j,i}^p \|_F \mid p = 1 \ldots N \}$   (3)

where m is the number of channels and N is the number of samples. Reduce and Reuse then uses this measure to retain a proportion of the output feature maps and corresponding filters that results in the greatest variation.

Removal of an output feature map is problematic given it is expected as an input channel in the next layer. To overcome this, they approximate a removed channel using the other channels. That is, if Y_i, Y_i' are the outputs of a layer before and after pruning a layer respectively, the aim is to find a matrix A such that:

$\min_A \sum_i \| Y_i - A Y_i' \|_2^2$   (4)

The matrix A is then included as an additional convolutional layer of 1 × 1 filters along the lines proposed by Lin et al. [80].

Polyak & Wolf [79] evaluate the above approach on the Scratch network, using the CASIA-WebFace and the Labeled Faces in the Wild (LFW) data sets. They utilise layer by layer pruning, where each layer is pruned and the network fine-tuned before moving on to the next layer. They experiment with their two pruning methods individually and in combination, and compare the results with the use of random pruning, a low rank approximation method [81] and Fitnets, a method that uses the Knowledge Distillation approach to learn smaller networks [82]. In the experiments with the Inbound pruning method, they prune channels where σ_{j,i}^2 is below a given threshold, selected such that the overall accuracy is maintained above 84%. For the experiments with the Reduce and Reuse method, they try different levels of pruning: 50%, 75%, and 90% for the earlier layers followed by 50% for the later layers. The adoption of a lower pruning rate for the later layers follows an observation that heavy pruning of the later layers results in a marked reduction in accuracy.

The results from their experiments show that: (i) the variance based method is more effective than use of random pruning, (ii) the use of fine-tuning does help in recovering accuracy, especially in the later layers, (iii) their methods result in greater compression than use of a low rank method and the use of Fitnets when applied to the Scratch network.
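The statistic in equation (3) can be estimated with a single forward pass over a sample of inputs. The sketch below is a rough PyTorch illustration for one convolutional layer; reducing each output map to its Frobenius norm per example is an assumption that mirrors the notation above, not the authors' code.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def channel_variance_scores(layer: nn.Conv2d, batches):
    """Estimate sigma_j^2 per output channel of `layer` from a sample of input batches."""
    norms = []
    for x in batches:                            # x: (B, C_in, H, W)
        y = layer(x)                             # output feature maps: (B, C_out, H', W')
        norms.append(y.flatten(2).norm(dim=2))   # Frobenius norm of each map, per example
    norms = torch.cat(norms, dim=0)              # (N, C_out) over all sampled examples
    return norms.var(dim=0)                      # low-variance channels are candidates for pruning
```

Channels whose variance falls below a chosen threshold (or outside the retained top fraction) would then be removed and approximated via equation (4), followed by fine-tuning.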
2) ENTROPY-BASED CHANNEL PRUNING
Instead of the variance, Luo & Wu (2017) propose an entropy-based metric to evaluate the importance of each filter [83]. In their filter pruning method, if a feature map contains less information, its corresponding filter is considered less important, and could be pruned. To compute the entropy of a particular feature map, they first sample the data and obtain a set of feature maps for each filter. Each feature map is reduced to a point measure using a global average pooling method, and the set of measures associated with each filter is discretized into q groups. The entropy of a filter, H_j, is then used to assess the discriminative power of the filter [83]:

$H_j = -\sum_{i=1}^{q} P_i \log(P_i)$   (5)

where P_i is the probability of an example being in group i.
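A minimal sketch of the entropy score in equation (5) is given below, assuming the layer's responses over a data sample have already been collected into a single tensor; the number of bins q and the histogram-based discretisation are assumptions made for illustration.

```python
import torch

@torch.no_grad()
def filter_entropy(feature_maps: torch.Tensor, q: int = 32) -> torch.Tensor:
    """feature_maps: responses of one layer over a sample, shape (N, C, H, W)."""
    pooled = feature_maps.mean(dim=(2, 3))          # global average pooling: (N, C)
    scores = []
    for j in range(pooled.shape[1]):                # one entropy value per filter
        hist = torch.histc(pooled[:, j], bins=q)
        p = hist / hist.sum()
        p = p[p > 0]                                # drop empty bins to avoid log(0)
        scores.append(-(p * p.log()).sum())
    return torch.stack(scores)                      # low entropy => filter carries little information
```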

They explore both one-shot pruning followed by fine-tuning and layer wise pruning in which they fine-tune with just one or two epochs of learning immediately after pruning a layer. Their layer wise strategy is an interesting compromise between fully fine-tuning after pruning each layer, which can be computationally expensive, and only fine-tuning at the end, which can fail to take account of the knock-on effects of pruning previous layers.

They evaluate the merits of the entropy-based method by applying it to VGG16 and ResNet-50 on the ImageNet data. For VGG16, they focus on the first 10 layers and also replace the fully connected layers by use of average pooling to obtain further reductions. They compare their results on VGG16 with those obtained by the magnitude based pruning method and the APoZ method (described below). Their results suggest that: (i) the entropy-based method achieves more than a 16 fold compression, though this is at the expense of a 1.56% reduction in accuracy, (ii) use of magnitude pruning results in a 13 fold compression, and (iii) APoZ results in a 2.7 fold compression. However, it should be noted that the higher compression rate achieved by the use of entropy includes the reduction due to the replacement of the fully connected layers by average pooling, without which the use of the entropy-based method leads to a lower compression rate than APoZ (Table 3 in [83]).

3) APoZ: NETWORK TRIMMING BASED ON ZEROS IN A CHANNEL
In contrast to the use of samples of data to compute the variance of a feature map or its entropy, Hu et al. (2016) suggest a direct method that is based on the view that the number of zeros in an output feature map is indicative of its redundancy [84]. Based on this view, they propose a method that uses the average number of zero activations (APoZ) in a feature map as a measure of the weakness of the filter that generates the feature map.

Their experiments are with LeNet5 on MNIST and VGG16 on ImageNet and are aimed at first finding the most appropriate layers to prune and then iteratively pruning these layers in a bespoke way that maintains or improves accuracy. Following pruning, they experiment with both retraining from scratch and fine-tuning the weights and prefer the latter given better results.

For LeNet-5, they observe that most of the parameters (over 90%) are in the 2nd convolution layer and the first fully connected layer and hence they focus on pruning these two layers in four iterations of pruning and fine-tuning, resulting in the size of the convolutional layer reducing from 50 to 24 filters and the number of neurons in the fully connected layer reducing from 500 to 252. Overall, this represents a compression rate of 3.85.

For VGG16, they also focus on one convolutional layer that has 512 filters and a fully connected layer with 4096 nodes. After 6 iterations, they reduce these to 390 filters and 1513 nodes, achieving a compression rate of 2.59.
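APoZ itself is straightforward to compute from post-ReLU activations gathered over a sample of data. The sketch below is a hedged illustration; the keep ratio is an assumed setting rather than one taken from [84].

```python
import torch

@torch.no_grad()
def apoz(activations: torch.Tensor) -> torch.Tensor:
    """Average Percentage of Zeros per channel; activations are post-ReLU
    feature maps over a data sample, shape (N, C, H, W)."""
    return (activations == 0).float().mean(dim=(0, 2, 3))

def weakest_channels(activations: torch.Tensor, keep_ratio: float = 0.7):
    """Return the indices of the channels with the highest APoZ (assumed keep ratio)."""
    scores = apoz(activations)
    n_keep = int(keep_ratio * scores.numel())
    return torch.argsort(scores)[n_keep:]       # candidates for pruning and fine-tuning
```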
4) PRUNING SMALL FILTERS AND FILTER SKETCHING
Li et al. (2017) extend the idea of magnitude pruning of weights to filters by proposing the removal of filters that have the smallest absolute sum among the filters in a layer [76]. That is, if the filters for producing the jth feature map are W_{j,i} ∈ R^{k×k} and m is the number of input feature maps, then the magnitude of the jth filter is defined by:

$s_j = \sum_{i=1}^{m} \| W_{j,i} \|_1$   (6)

Once the s_j are computed, a proportion of the smallest filters, together with their associated feature maps and the filters in the next layer, are removed. After a layer is pruned, the network is fine-tuned, and pruning is continued layer by layer.
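The criterion in equation (6) reduces to a single tensor reduction per layer. The sketch below shows one possible PyTorch form; the fixed prune ratio is an assumption, whereas in [76] the ratio is tuned per layer via a sensitivity analysis.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def smallest_filters(conv: nn.Conv2d, prune_ratio: float = 0.3):
    """Score each output filter by the sum of its absolute weights (equation (6))
    and return the indices of the smallest ones."""
    w = conv.weight                             # shape (C_out, C_in, k, k)
    s = w.abs().sum(dim=(1, 2, 3))              # s_j = sum_i ||W_{j,i}||_1
    n_prune = int(prune_ratio * s.numel())
    # Removing these filters also removes their feature maps and the matching
    # input channels of the next layer, followed by fine-tuning.
    return torch.argsort(s)[:n_prune]
```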
To test this approach, they carry out experiments on VGG16 and ResNet56 & 110 on CIFAR10 and ResNet34 on ImageNet. By analyzing the sensitivity of the layers through experimentation, they determine appropriate pruning ratios for each layer that would not compromise accuracy significantly. Overall, for VGG16, they are able to prune the parameters by 64%. A significant proportion of this pruning is in layers 8 to 13, which consist of the smaller filters (2 × 2 and 4 × 4), which they notice can be pruned by 50% without reducing accuracy. The level of pruning for the other networks is more modest, with the best pruning rate for ResNet-56 and ResNet110 on CIFAR10 being 3.7% and 32.4% respectively, and for ResNet-34 on ImageNet being 10.8%.

They also compare their approach with the variance-based method described above and conclude that use of the above measure over filters performs at least as well but without the additional need to compute the feature maps via samples of the data.

A more recent method, proposed by Lin et al. (2020), known as filter sketch, also aims to reduce the number of filters without the need to sample examples [85]. The key idea in filter sketching is to minimize the difference between the co-variances of the original set of filters and the reduced set. Although this can be done using optimization methods, filter sketch utilises a greedy algorithm known as Frequent Direction [86] which is more efficient.

Lin et al. [85] evaluate the filter sketch method on GoogleNet, ResNet56 and ResNet110 using the CIFAR10 data, and on ResNet50 with the ImageNet data. The results show that it performs well relative to the method for pruning small filters and a method that uses optimization to prune channels (described below in Section III-C7) in terms of reducing the number of parameters without a significant loss in accuracy.

5) PRUNING FILTERS BASED ON GEOMETRIC MEDIAN
He et al. (2019) point out that pruning based on the magnitude of filters assumes that there are some small filters and that the spread of magnitude is wide enough to adequately distinguish those filters that contribute from those that do not contribute [87].

So, for example, if most of the weights are small, one could end up removing a significant number of filters, and if most of the filters have large values, no filters would be removed, even though there may be filters that are relatively small. Hence, they propose a method based on the view that the geometric median of the filters shares most of the information common in the other filters and hence a filter that is close to it can be covered by the other filters if deleted. Computing the geometric median can be time-consuming, so they approximate its computation by assuming that one of the filters will be the geometric median. Their pruning strategy is to prune and fine-tune repeatedly using a fixed pruning factor for all layers.

They carry out an evaluation with respect to several methods including pruning small filters [76], ThiNet [88], Soft filter pruning [89], and NISP [90]. These methods are evaluated on ResNets trained on the CIFAR10 and ImageNet data, with pruning rates of 30 and 40 percent. In general, the drop in accuracy is similar across the different methods, though there is a significant reduction in FLOPS when using the geometric median method on ResNet-50 (53.5%) compared to the other methods (e.g., ThiNet 36.7%, Soft filter pruning 41%, NISP 44%).
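A hedged sketch of this criterion is shown below, using the approximation described in the text: the filter that minimises the total distance to all other filters stands in for the geometric median, and the filters closest to it are treated as redundant. The prune ratio is an assumed setting.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def filters_near_geometric_median(conv: nn.Conv2d, prune_ratio: float = 0.3):
    w = conv.weight.flatten(1)                 # (C_out, C_in*k*k): one row per filter
    dist = torch.cdist(w, w)                   # pairwise Euclidean distances between filters
    median_idx = dist.sum(dim=1).argmin()      # filter standing in for the geometric median
    n_prune = int(prune_ratio * w.shape[0])
    order = torch.argsort(dist[median_idx])    # filters closest to the median are most redundant
    return order[1:n_prune + 1]                # skip the median itself
```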
6) ThiNet AND AOFP
Luo et al. (2017) formulate the pruning task as an optimization problem and propose a system, ThiNet, in which the objective is to find a subset of input channels that can best approximate the output feature maps [88]. The channels not in the subset and their corresponding filters can then be removed. Solving the optimization problem is computationally challenging, so ThiNet uses a greedy algorithm that finds a channel that contributes the least, adds it to the list to be removed, and repeats the process with the remaining channels until the number of channels selected equals the number to be pruned. Once a subset of filters to be retained is identified, their weights are obtained by using least squares to find the filters W that minimize [88]:

$\sum_{i=1}^{m} \left( Y_i - W^T \cdot X_i \right)^2$   (7)

where Y_i are the m sampled points in the output channels and X_i their corresponding input channels.
Xi their corresponding input channels. X
They evaluate their approach in two sets of experiments. Y = Xi WiT (8)
In the first, they adapt VGG16, replacing the fully connected i=1
layers by global average pooling (GAP) layers, apply it to They define the task as one to optimize:
the UCSD-Birds data and then prune it using ThiNet, APoZ
and the small filters method. Their results show there is less c 2
1 X
degradation in accuracy with ThiNet than ApoZ, which in arg min Y− βi Xi WiT
β,W 2
turn, is better than the small filters method. i=1 F
In their second set of experiments, they utilise VGG16 and subject to kβk0 ≤ p (9)
ResNet50 trained on the ImageNet data. For VGG16, their
procedure involves pruning a layer and then minor fine-tuning where p indicates the number of channels retained and βi ∈
with one epoch of training with an additional epoch at the end {0, 1} indicates the retention or removal of a channel.
of each group of convolutional layers and a further 12 epochs In contrast to ThiNet, which adopts a greedy heuristic to
of fine-tuning after the final layer. With the use of GAP, solve this optimization problem, He et al. (2017) relax the
ThiNet, reduces the number of parameters by about 94% at problem from L0 to L1 regularization and utilise LASSO

63288 VOLUME 10, 2022


S. Vadera, S. Ameen: Methods for Pruning Deep Neural Networks

In contrast to ThiNet, which adopts a greedy heuristic to solve this optimization problem, He et al. (2017) relax the problem from L0 to L1 regularization and utilise LASSO regression to solve [94]:

$\arg\min_{\beta, W} \frac{1}{2} \left\| Y - \sum_{i=1}^{c} \beta_i X_i W_i^T \right\|_F^2 + \lambda \|\beta\|_1 \quad \text{subject to } \|\beta\|_0 \le p$   (10)

Following the selection of the channels, they utilise least squares to obtain the revised weights in a manner similar to the approach adopted in ThiNet.
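Since the relaxed problem in equation (10) is a standard LASSO, it can be sketched with an off-the-shelf solver. In the snippet below, each column of `contribs` holds X_i W_i^T evaluated at sampled output positions and y is the corresponding full output; the regularisation strength alpha and the selection tolerance are assumptions.

```python
import numpy as np
from sklearn.linear_model import Lasso

def lasso_channel_selection(contribs: np.ndarray, y: np.ndarray, alpha: float = 1e-3):
    """Select channels by solving the L1-relaxed problem; beta_i near zero => prune channel i."""
    lasso = Lasso(alpha=alpha, fit_intercept=False, max_iter=10000)
    lasso.fit(contribs, y)
    beta = lasso.coef_
    keep = np.flatnonzero(np.abs(beta) > 1e-6)
    return keep    # the kept filters are then refined with least squares, as in ThiNet
```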
They carry out empirical evaluations on VGG16, ResNet50 and a version of the Xception network, trained on the CIFAR10 and ImageNet data. They also explore the extent to which the pruned models can be used for transfer learning by using them for the PASCAL VOC 2007 object detection task.

In their first set of experiments, they evaluate their method on single layers of VGG16 trained on CIFAR10 without any fine-tuning, and show that their algorithm maintains Top-5 accuracy better than the method of pruning small filters. They also include results from a naïve method that selects the first k feature maps and show that, for some layers (e.g. conv3_3 in VGG16), this sometimes performs better than the method of pruning small filters, highlighting a potential weakness of magnitude-based pruning.

In a second set of experiments, with VGG16 on CIFAR10, they apply their method on the full network, using bespoke pruning ratios for the layers and fine-tuning to achieve 2, 4 and 5 fold improvements in run-time, but resulting in drops of Top-5 accuracy of 0%, 1%, and 1.7% respectively. In comparison, the method for pruning small filters results in larger drops of 0.8%, 8.6% and 14.6%.

Their experiments on ResNet50 adopt bespoke pruning rates per layer, retaining 70% of layers that are very sensitive to pruning, and 30% of the less sensitive layers. The Top-5 accuracy results on ImageNet show a two-fold improvement in run-time at the expense of a 1.4% drop in accuracy compared to a baseline accuracy of 92.2%, while the results on the Xception network show a drop of 1% in accuracy from a baseline of 92.8%.

The experiment on using a pruned version of a VGG16 model on the PASCAL VOC 2007 object detection benchmark task results in a 2-fold increase in speed with a 0.4% drop in average precision.

IV. PRUNING BASED ON SIMILARITY AND CLUSTERING
Given that neural networks can be over-parametrised, it is plausible that there could be duplicate weights or filters that perform similar functions and can be removed without impacting accuracy [19], [20], [98]–[100].

RoyChoudry et al. (2017) explore this hypothesis by using the inner product of two filters (or weight matrices) as a measure of similarity [19]. Their pruning algorithm involves grouping filters that are similar and then replacing each group of filters by their mean filter. They carry out experiments with both a multilayer perceptron (MLP) and a CNN for the CIFAR10 data. The MLP has three layers: the first two are fully connected layers and the third is a softmax layer with 10 nodes representing the classes for CIFAR10. The CNN has two convolution layers, each followed by a ReLU and a 2 × 2 max pooling layer. The convolutional layers are followed by two fully connected layers to perform the classification. In both cases, the first layer is varied with 100, 500 and 1000 units (nodes or filters) to explore the effects of increasing over-parametrisation. Their main finding is that there is a much greater propensity for similar weights/filters to occur in MLPs than in CNNs. As a consequence, there is a greater opportunity for using similarity as a basis for pruning MLPs than for pruning CNNs. Nevertheless, their results suggest that a similarity based pruning algorithm is better at retaining accuracy than using the small filters method.

Ayinde et al. (2019) also develop a method that uses clustering to identify similar filters [78]. They too adopt the inner product as a measure of similarity, but use an agglomerative hierarchical clustering method to group similar filters and replace the filters by randomly selecting one filter from each cluster. They carry out various experiments with VGG16 on CIFAR10 and ResNet34 on ImageNet. For the trial on VGG16 with the CIFAR10 data, they show that, once an optimal value for the threshold for similarity is determined, their method achieves both a better pruning rate and accuracy than other methods, including pruning of small filters, Network Slimming [92] (a method that uses regularization to identify weak channels), and try-and-learn [93] (a method that uses sensitivity analysis).
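A possible sketch of this kind of similarity-based pruning, using SciPy's agglomerative (hierarchical) clustering, is shown below. The cosine distance (closely related to the normalised inner product) and the distance threshold are assumptions made for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

def cluster_similar_filters(weights: np.ndarray, threshold: float = 0.15):
    """weights: array of shape (C_out, C_in, k, k); returns the filters to keep,
    one representative per cluster of similar filters."""
    flat = weights.reshape(weights.shape[0], -1)
    dist = pdist(flat, metric="cosine")                     # pairwise filter dissimilarity
    clusters = fcluster(linkage(dist, method="average"),
                        t=threshold, criterion="distance")  # agglomerative grouping
    keep = [int(np.flatnonzero(clusters == c)[0]) for c in np.unique(clusters)]
    return sorted(keep)                                     # remaining filters are candidates for removal
```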
V. SENSITIVITY ANALYSIS METHODS
The primary goal of pruning is to remove weights, filters and channels that have least effect on the accuracy of a model. The magnitude and similarity based methods described above address this goal implicitly by using properties of weights, filters and channels that can affect accuracy. In contrast, this section presents methods that use sensitivity analysis to model the effect of perturbing and removing weights, filters and channels on the loss function.

Section V-A describes methods that assess the importance of channels and Sections V-B to V-D present the development of a line of research that approximates the effect of perturbing the weights on the loss function using the Taylor series, from the earliest work which developed methods for MLPs to the more recent research on methods for pruning CNNs.

A. PRUNING BY ASSESSING THE IMPORTANCE OF NODES AND CHANNELS
Skeletonization, a method proposed by Mozer & Smolensky (1988), was one of the earliest approaches to pruning neural networks [9]. To calculate the effect of removing nodes, Skeletonization introduced the notion of attentional strength to denote the importance of nodes when computing activations.

Given the attentional strengths of the nodes, α_i, the output y_j from node j is defined by:

$y_j = f\left( \sum_i w_{ji} \alpha_i y_i \right)$   (11)

where f is assumed to be the sigmoid function. The importance of a node, ρ_i, is then defined in terms of the difference in loss when α_i is set to zero and when it is set to one, and can be approximated by the derivative of the loss with respect to the attentional strength α_i:

$\rho_i = L_{\alpha_i=0} - L_{\alpha_i=1} \approx -\left. \frac{\partial L}{\partial \alpha_i} \right|_{\alpha_i=1}$   (12)

Through experimentation, they found that a linear loss worked better than the quadratic loss because the difference between the outputs and targets was small following training. In addition, they noticed that the ∂L(t)/∂α_i were not stable with time, so they used a weighted average measure to compute the importance ρ̂_i:

$\hat{\rho}_i(t+1) = 0.8\,\hat{\rho}_i(t) + 0.2\,\frac{\partial L(t)}{\partial \alpha_i}$   (13)

Mozer & Smolensky [9] present a number of small but very interesting experiments. These include generating examples where the output is correlated to four inputs, A, B, C, and D, with full correlation on A and reducing to no correlation with D. They provide this as input to a network with one hidden node and, following training, they observe that the weights from the inputs to the hidden node follow the correlations, although the relevance measure only shows input node A as important, providing some reassurance that the measure is different from the weights. In another example, they develop a network to model a 4-bit multiplexor, which has 4 bits as input and two bits to control which of the 4 bits is output. They try two network configurations: in the first, they utilise 4 hidden nodes and in the second they utilise 8 hidden nodes and use skeletonization to reduce its size to 4 hidden nodes. When limiting training to 1000 epochs, they find that starting with 4 hidden nodes initially results in failure to converge in 17% of the cases, while beginning with 8 hidden layer nodes followed by skeletonization converges in all the cases and also retains accuracy. This appears to be one of the first demonstrations that, to begin with, it may be necessary to overparameterize a network in order to find winning lotteries.

This idea of assessing the importance of nodes has been extended to channels by two methods, namely Network Slimming [92] and Sparse Structure Selection (SSS) [55], that learn a measure of importance as part of the training process. Both utilise a parameter γ for each channel (analogous to the attentional strength) which scales the output of a channel. Given a loss function L, the new loss L' is defined with an additional regularization term over the scaling factors γ:

$L' = L + \lambda \sum_{\gamma} g(\gamma)$   (14)

where the function g is selected as the L1 norm to reduce γ towards zero (as in Lasso regression).

The two methods differ in the way they implement the training process aimed at minimizing L', with Network Slimming taking advantage of the batch normalization layers that are sometimes present following convolutional layers, while SSS implements a more general process that does not assume the presence of batch normalization layers and allows use of scaling factors for blocks (such as residual and inception blocks) that can enable reduction of the depth of a network.

Huang and Wang (2018) experiment with SSS on the CIFAR-10, CIFAR-100, and ImageNet data on VGG16 and ResNet [55]. For CIFAR10, SSS is able to reduce the number of parameters in VGG16 by 30% without loss of accuracy. For ResNet-164, it is able to achieve a 2.5 times speedup at the cost of a 2% loss in accuracy for CIFAR-10 and CIFAR-100. For VGG16 on ImageNet, SSS is able to reduce the FLOPs by about 75%, though parameter reduction is minimal, which is consistent with other methods given the large number of parameters in the fully connected layers in VGG16. On ResNet50, SSS achieves a 15% reduction in FLOPs at a cost of a 0.1% reduction in Top-1 accuracy.
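In the Network Slimming instantiation of equation (14), the batch normalization scale factors play the role of γ, so the penalty can be added to the training loss in a few lines. The sketch below is a hedged illustration; the value of λ is an assumption.

```python
import torch
import torch.nn as nn

def scaling_factor_penalty(model: nn.Module, lam: float = 1e-4) -> torch.Tensor:
    """L1 penalty on the per-channel batch-norm scale factors (the gammas of equation (14))."""
    terms = [m.weight.abs().sum() for m in model.modules() if isinstance(m, nn.BatchNorm2d)]
    return lam * torch.stack(terms).sum()

# Assumed use inside a training step:
#   loss = criterion(model(x), y) + scaling_factor_penalty(model)
#   loss.backward()
# After training, channels whose gamma has been driven close to zero are pruned
# and the slimmed network is fine-tuned.
```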
B. PRUNING WEIGHTS WITH OBD AND OBS
Several studies utilise the Taylor series to approximate the effect of weight perturbations on the loss function [22], [23], [101]. Given the change in weights ΔW, a Taylor series approximation of the change in loss ΔL can be stated as [102]:

$\Delta L = \frac{\partial L}{\partial W}^T \Delta W + \frac{1}{2} \Delta W^T H \Delta W + O\left(\|\Delta W\|^3\right)$   (15)

where H is a Hessian matrix whose elements are the second order derivatives of the loss with respect to the weights:

$H_{ij} = \frac{\partial^2 L}{\partial w_i \partial w_j}$   (16)

Most methods that adopt this approximation assume that the third order term is negligible. In Optimal Brain Damage (OBD), LeCun et al. (1990) also assume that the first order term can be ignored, given that the network will have been trained to achieve a local minimum, resulting in a simplified quadratic approximation [22]:

$\Delta L = \frac{1}{2} \Delta W^T H \Delta W$   (17)

Given the large number of weights, computing the Hessian is computationally expensive, so they also assume that the change in loss can be approximated by the diagonal elements of the Hessian, resulting in the following measure of the saliency s_k of a weight w_k:

$s_k = \frac{H_{kk} w_k^2}{2}$   (18)

where the second order derivatives H_kk are computed in a manner similar to the way the gradient is computed in backpropagation.
Given a loss function L, the new loss L0 is defined with an sk = (18)
additional regularization term over the scaling factors γ : 2
X where the second order derivatives, Hkk are computed in
L0 = L + λ g(γ ) (14) a manner similar to the way the gradient is computed in
u backpropagation.
where the function g is selected as the L1 norm to reduce γ Hassibi et al. (1993) argue that ignoring the non-diagonal
towards zero (as in Lasso regression). elements of a Hessian is a strong assumption, and propose

63290 VOLUME 10, 2022


S. Vadera, S. Ameen: Methods for Pruning Deep Neural Networks

an alternative pruning method, called Optimal Brain Surgeon using the measure, pruning it, and then fine-tuning the net-
(OBS), that aims to take account of all the elements of a work before repeating the process until a stopping condition,
Hessian [23], [24]. that takes account of the need to reduce the number of FLOPs
Using a unit vector, em , to denote the selection of the mth while maintaining accuracy, is met. Their experiments reveal
weight as the one to be pruned, OBS reformulates pruning as several interesting findings:
a constraint-based optimization task: 1) From experiments on VGG16 and AlexNet on the

1 T
 UCSD-Birds and Oxford-Flowers data, they show that
min δw · H · δw the features maps selected by their criteria correlate
δw 2
significantly more closely to those selected by an oracle
subject to eTm .δw + δwm = 0 (19)
method than OBD and APoZ. On the ImageNet data,
Formulating this with a Lagrangian multiplier, λ, the task is they find that OBD correlates best when AlexNet is
to minimize: used.
2) In experiments on transfer learning, where they fine-
1 T  
δw · H · δw + λ eTm · δw + δwm (20) tune VGG16 on the UCSD-Birds data, they present
2 results showing that their method performs better than
By taking derivatives and utilizing the above constraint, they APoZ and OBD as the number of parameters pruned
show the saliency, sk of weight wk can be computed using: increases. In an experiment in which AlexNet is fine-
tuned for the Oxford Flowers data, they show that both
1 w2k
sk =   (21) their method and OBD perform better than APoZ.
2 H −1 k,k 3) In a striking example of the potential benefits of
pruning, they demonstrate their method on a network
They show that on the XOR problem, modelled using a
for recognizing hand gestures that requires over 37
MLP network with 2 inputs, 2 hidden layer nodes and one
GFLOPs for a single inference but only requires 3
output, OBS is better at detecting the correct weights to delete
GFLOPs after pruning, all be it with a 2.6% reduction
than OBD or magnitude pruning. They also show that OBS
in accuracy.
is able to significantly reduce the number of weights required
for neural networks trained on the Monk problems [103] and In a follow up publication, Molchanov et al. (2019)
for NetTalk [104], one of the classical applications of neural acknowledge some limitations of the above approach, namely
networks, it is able to reduce the number of weights required that assuming that all layers have the same importance does
from 18000 to 1560. not work for skip connections (used in the ResNet architec-
ture) and that assessing the impact of changes in feature maps
C. PRUNING FEATURE MAPS WITH FIRST-ORDER TAYLOR leads to increases in memory requirements [106]. They there-
APPROXIMATIONS fore propose an alternative formulation, also using a Taylor
The methods described in Section V-B focus on the effect of series approximation, but based on estimating the squared
removing weights in a fully connected network. Molchanov loss due to the removal of the mth parameter:
et al. (2016) introduce a method that uses the Taylor series to  2
approximate what happens if a feature map is removed [105]. 1
(1Lm )2 = gm wm + wm Hm W (23)
In contrast to OBD and OBS, which assume that the first order 2
term can be ignored, they adopt a first order approximation,
ignoring the higher order terms, primarily on grounds of where gm is the first order gradient and Hm is the mth row
computational complexity. Using a first order approxima- of the Hessian matrix. The measure of importance of a filter
tion seems odd given the convincing argument for ignoring is then obtained by summing the contributions due to each
these terms; however they argue that although the first order parameter in a filter.
gradient tends to zero, the expected value of the change in The pruning algorithm employed proceeds as follows.
loss is proportional to the variance, which is not zero and In each epoch, they utilise a fixed number of mini-batches
is a measure of the stability as a local solution is reached. to estimate the importance of each filter and then, based on
Given a feature map with N elements Yi,j , the first order their importance, a predefined number of filters is removed.
approximation using the Taylor series leads to the following The network is then fine-tuned, and the process repeated until
measure of the absolute change in loss [105]: a pruning goal, such as the desired number of filters or a limit
for an acceptable drop in accuracy is reached.
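The following sketch (our illustration, assuming PyTorch; the layer, data and loss are placeholders rather than the authors' implementation) shows how the measure in equation (22) can be estimated for the output channels of a single convolutional layer, followed by the layer-wise L2 normalization:

import torch

def taylor_channel_importance(feature_map, feature_grad):
    # Equation (22) per output channel: |(1/N) * sum(dL/dY * Y)|, with the
    # average taken over spatial positions and then over the mini-batch.
    contrib = (feature_map * feature_grad).mean(dim=(2, 3))
    importance = contrib.abs().mean(dim=0)
    return importance / (importance.norm(p=2) + 1e-8)    # layer-wise L2 normalization

conv = torch.nn.Conv2d(3, 8, 3, padding=1)
x = torch.randn(4, 3, 32, 32)
y = conv(x)
y.retain_grad()                                           # keep dL/dY for the estimate
loss = y.pow(2).mean()                                    # stand-in for the task loss
loss.backward()
scores = taylor_channel_importance(y.detach(), y.grad)    # one importance score per channel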
The pruning process they adopt involves selecting a feature map using the measure, pruning it, and then fine-tuning the network before repeating the process until a stopping condition, which takes account of the need to reduce the number of FLOPs while maintaining accuracy, is met. Their experiments reveal several interesting findings:
1) From experiments on VGG16 and AlexNet on the UCSD-Birds and Oxford-Flowers data, they show that the feature maps selected by their criteria correlate significantly more closely to those selected by an oracle method than OBD and APoZ. On the ImageNet data, they find that OBD correlates best when AlexNet is used.
2) In experiments on transfer learning, where they fine-tune VGG16 on the UCSD-Birds data, they present results showing that their method performs better than APoZ and OBD as the number of parameters pruned increases. In an experiment in which AlexNet is fine-tuned for the Oxford Flowers data, they show that both their method and OBD perform better than APoZ.
3) In a striking example of the potential benefits of pruning, they demonstrate their method on a network for recognizing hand gestures that requires over 37 GFLOPs for a single inference but only 3 GFLOPs after pruning, albeit with a 2.6% reduction in accuracy.

In a follow up publication, Molchanov et al. (2019) acknowledge some limitations of the above approach, namely that assuming that all layers have the same importance does not work for skip connections (used in the ResNet architecture) and that assessing the impact of changes in feature maps leads to increases in memory requirements [106]. They therefore propose an alternative formulation, also using a Taylor series approximation, but based on estimating the squared loss due to the removal of the mth parameter:

  (ΔL_m)² = ( g_m w_m + ½ w_m H_m W )²    (23)

where g_m is the first order gradient and H_m is the mth row of the Hessian matrix. The measure of importance of a filter is then obtained by summing the contributions due to each parameter in a filter.

The pruning algorithm employed proceeds as follows. In each epoch, they utilise a fixed number of mini-batches to estimate the importance of each filter and then, based on their importance, a predefined number of filters is removed. The network is then fine-tuned, and the process is repeated until a pruning goal, such as the desired number of filters or a limit for an acceptable drop in accuracy, is reached.

They carry out initial experiments on versions of LeNet and ResNet on the CIFAR10 data, using both the second and first order approximations (in equation 23) and, given that the results from both correlate well with an oracle method, they utilise the first-order measure, which is significantly more efficient to compute.
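The loop can be sketched as below (again only an illustration: it uses just the first-order part of equation (23), accumulates gradients over a few mini-batches, zeroes the selected filters rather than physically removing them, and omits the fine-tuning step):

import torch
import torch.nn as nn

def filter_importance(conv):
    # First-order contribution (g_m * w_m)^2 of each parameter, summed per filter.
    g = conv.weight.grad
    return (g * conv.weight).pow(2).sum(dim=(1, 2, 3))

def prune_lowest_filters(conv, n_remove):
    # Zero out the n_remove filters with the smallest estimated importance.
    idx = torch.argsort(filter_importance(conv))[:n_remove]
    with torch.no_grad():
        conv.weight[idx] = 0.0
        if conv.bias is not None:
            conv.bias[idx] = 0.0

model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                      nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 10))
criterion = nn.CrossEntropyLoss()
for _ in range(4):                                    # a fixed number of mini-batches
    x, t = torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))
    criterion(model(x), t).backward()                 # gradients accumulate across batches
prune_lowest_filters(model[0], n_remove=4)            # then fine-tune and repeat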


In experiments with versions of ResNet, VGG, and DenseNet on the ImageNet data, they consider the effect of using the measure of importance at points before and after the batch normalization layers, and conclude that the latter option results in greater correlation with an oracle method. The results from their method show that it works especially well on pruning ResNet-50 and ResNet-34, outperforming the results from ThiNet and NISP. The reported results for other networks are also impressive, with their method able to prune 76% of the parameters in VGG with a 0.19% loss in accuracy and able to reduce the number of parameters in DenseNet by 43% at the expense of a 0.29% reduction in accuracy.

D. PRUNING FEATURE MAPS WITH SECOND-ORDER TAYLOR APPROXIMATIONS
The first order methods described above assume minimal interaction across channels and filters. This section summarizes recent pruning methods that aim to take account of the effect of the potential dependencies amongst the channels and filters.

In a method called EigenDamage, which also utilises the Taylor series approximation, Wang et al. (2019) revisit the assumptions made by OBD and OBS when approximating the Hessian [101]. To motivate their method, they begin by illustrating that although OBS is better than OBD when pruning one weight at a time, it is not necessarily superior when pruning multiple weights at a time. This is primarily because OBS does not correctly model the effect of removing multiple weights, especially when they are correlated. To avoid this problem, they utilise a Fisher Information Matrix to approximate the Hessian and then they utilise a method, proposed by Grosse & Martens [107], to represent a Fisher Matrix by a Kronecker Factored Eigenbasis (KFE). This reparameterization allows pruning to be done in a new space in which the Fisher Matrix is approximately diagonal. Pruning can thus be done by first mapping the weights to a KFE space in which they are approximately independent, and then mapping the results back to the original space.

EigenDamage is evaluated on VGG and ResNet on the CIFAR10, CIFAR100 and Tiny-ImageNet data. Experiments are carried out with one-shot pruning, where fine-tuning is performed at the end, and with iterative pruning, in which fine-tuning is performed after each cycle. In both cases, the results show that EigenDamage outperforms adapted versions of OBD, OBS and Network Slimming.

Peng et al. (2019) also utilise a Taylor series approximation to develop a Collaborative Channel Pruning (CCP) method that is based on a measure of the impact of a combination of channels [108]. Given a mask β, where β_i = 1 indicates the retention of a channel and β_i = 0 indicates a channel to be pruned, they formulate the task as one of finding the β_i that minimize the loss L:

  L(β, W) = L(W) + Σ_{i=1}^{co} (β_i − 1) g_iᵀ w_i + ½ Σ_{i=1,j=1}^{co} (β_i − 1)(β_j − 1) w_iᵀ H_{i,j} w_j    (24)

where g_i are the first order derivatives of the loss with respect to the weights in the ith output channel, H_{i,j} are Hessians, and co denotes the number of output channels.

By setting u_i = g_iᵀ w_i and s_{i,j} = ½ w_iᵀ H_{i,j} w_j, the above equation can be written as the following 0-1 quadratic optimization problem [108]:

  min_{β} Σ_{i=1}^{co} u_i (β_i − 1) + Σ_{i=1,j=1}^{co} s_{i,j} (β_i − 1)(β_j − 1)
  subject to:  ‖β‖₀ = p  and  β_i ∈ {0, 1}    (25)

where p denotes the number of channels to be retained in a layer. They note that the gradients g_i, and hence u_i, can be computed in linear time. However, given the complexity of computing the Hessian matrices, they derive first order approximations for the loss functions, which they adopt when computing s_{i,j}. To solve the quadratic optimization problem, they relax the constraint to β_i ∈ [0, 1] and use a quadratic programming method to find the β_i, which are used to select the top p channels to retain. They apply the optimization process on each layer to obtain the masks β_i, use these to prune, and then perform fine-tuning at the end.

An empirical evaluation of CCP is carried out by pruning the ResNet models trained on the CIFAR10 and ImageNet data, and the results are compared to several methods including: pruning small filters, ThiNet, optimizing channel pruning, Soft Filter pruning [89], NISP [90] and AutoML [109]. For CIFAR10, the experiments are carried out with pruning rates of 35% and 40%, and in each case CCP has a smaller drop in accuracy (0.04% and 0.08% respectively) than the other methods, with the exception of the method for pruning small filters, which results in a small improvement in accuracy (0.02%). However, the pruning small filters method has a much lower reduction in FLOPs (27.6%) in comparison to CCP (52.6%). The results for ImageNet show that, for similar reductions in FLOPs, CCP has less of a drop in accuracy than the other methods.

It is worth noting that, like EigenDamage, CCP is able to obtain good results without the need for an iterative process that uses fine-tuning after pruning each layer.
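A simplified stand-in for the relaxed optimization in equation (25) is sketched below; it replaces the quadratic programming solver used by CCP with projected gradient descent and assumes the linear terms u and the pairwise terms S (the matrix of s_{i,j}) have already been estimated:

import numpy as np

def ccp_select_channels(u, S, p, steps=500, lr=0.01):
    # Relaxed version of equation (25): minimize u.(beta-1) + (beta-1)^T S (beta-1)
    # with beta in [0, 1], then keep the p channels with the largest beta.
    beta = np.full(u.size, 0.5)                    # start with an undecided mask
    for _ in range(steps):
        d = beta - 1.0
        grad = u + (S + S.T) @ d                   # gradient of the relaxed objective
        beta = np.clip(beta - lr * grad, 0.0, 1.0)
    mask = np.zeros(u.size, dtype=int)
    mask[np.argsort(beta)[-p:]] = 1
    return mask

rng = np.random.default_rng(2)
u = rng.normal(size=8)                             # u_i = g_i^T w_i, estimated per channel
S = rng.normal(scale=0.1, size=(8, 8))             # s_{i,j}, estimated per channel pair
mask = ccp_select_channels(u, S, p=5)              # keep 5 of the 8 channels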


VI. COMPARISON OF PUBLISHED RESULTS
As the above sections describe, previous studies of pruning report results on varying data sets, architectures and methods that have evolved with time, making comparison of results across the different studies difficult. The survey provides a resource in the form of a pivot table that can be used by the community to explore the reported performance of over 50 methods on different architectures and data.4 Table 2 shows how many times each combination of data and architecture has been used, indicating the wide variety of comparisons possible.

4 This resource is available from https://1drv.ms/x/s!ArCIJ6nceQY3-BWxGMfJRbmhyUK_?e=WZUyhm
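The exact layout of the spreadsheet is not reproduced here, but the sketch below (with placeholder method names and values, assuming pandas) illustrates the kind of pivot the resource supports, tabulating reported FLOP reductions by method, architecture and data set:

import pandas as pd

records = pd.DataFrame([
    {"method": "MethodA", "architecture": "ResNet56", "dataset": "CIFAR10",
     "top1_drop_pct": 0.1, "flops_reduction_pct": 50.0},
    {"method": "MethodB", "architecture": "VGG16", "dataset": "CIFAR10",
     "top1_drop_pct": -0.2, "flops_reduction_pct": 34.0},
    {"method": "MethodA", "architecture": "AlexNet", "dataset": "ImageNet",
     "top1_drop_pct": 0.5, "flops_reduction_pct": 41.0},
])

pivot = records.pivot_table(index="method",
                            columns=["architecture", "dataset"],
                            values="flops_reduction_pct")
print(pivot)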
TABLE 2. Number of reported results for a given architecture and data set.

To illustrate the use of the resource, we use it to compare the reported results on two combinations of architecture and data for which there are a significant number of comparisons across different pruning methods, namely: (i) AlexNet and ResNet50 on ImageNet and (ii) ResNet56 and VGG16 on CIFAR10. Fig. 4 shows the results reported in terms of the drop in Top-1 accuracy, and percent reduction in FLOPs or parameters, where the labels used for the pruning methods are from the primary sources, with suffixes reflecting the variations in pruning methodology used. The main observations are that:
1) For AlexNet on ImageNet, AOFP-B2 achieves a 41% reduction in FLOPs with a 0.46% increase in accuracy, and Dyn Surg [65], SSR-L [56] and NeST [41] achieve over 93% reduction in parameters without loss in accuracy. Other methods that compromise accuracy do not necessarily result in a greater reduction in parameters.
2) In comparison to AlexNet, it is harder to prune ResNet50 on ImageNet, although AOFP-C1 achieves a 33% reduction in FLOPs without affecting accuracy. As accuracy is compromised, there are methods that show significant reductions in parameters. These include KSE and ThiNet, with reductions in parameters of 78% and 95%, and a decline of 0.64% and 0.84% in accuracy, respectively.
3) When pruning ResNet56 on the CIFAR10 data, the methods KSE and SFP-NFT show reductions in FLOPs of 60% and 28% without compromising accuracy. For VGG16 on CIFAR10, AOFP, PF_EC and NetSlimming result in a 75%, 63%, and 51% reduction in FLOPs respectively without reductions in accuracy. For both networks, it appears to be difficult to gain further reductions (beyond KSE and AOFP) even when compromising accuracy.
4) The charts show that several methods are able to reduce FLOPs and parameters without compromising accuracy and aid generalizability (e.g., AOFP, SFP, NetSlimming), though compromising accuracy a little can sometimes lead to more significant reductions in FLOPs and parameters.
5) When looking at results within methods, it is possible to confirm our expectation that compromising accuracy can result in greater reductions in parameters and FLOPs (e.g., see results for Filter Sketch and FPGM for ResNet50 in Fig. 4). However, this trade-off is not evident when considering results across different pruning methods.
6) Although AOFP does perform well in retaining accuracy for three of the four cases, in general, the performance of the methods varies depending on the architecture and the data set.


FIGURE 4. Results of pruning AlexNet and ResNet50 on ImageNet (left column), and ResNet56 and VGG16 on the CIFAR10 data (right column). Charts show the percent reduction in parameters where available (blue bars, left axis) and FLOPs (orange bars, left axis), and reduction in baseline Top-1 accuracy (grey line, right axis). The labels used for the methods are from the primary sources, with suffixes reflecting the variations in pruning rates.

VII. CONCLUSION AND FUTURE WORK
This paper has presented a survey of methods for pruning neural networks, focusing on methods based on magnitude pruning, use of similarity and clustering, and methods based on sensitivity analysis.

Magnitude based pruning methods have developed from removal of small weights in MLPs to methods for pruning filters and channels which lead to substantial reductions in the size of deep networks. The range of methods developed includes: (i) those that are data dependent and use examples to assess the relevance of output channels, (ii) methods that are independent of data, which assess the contributions of filters and channels directly, and (iii) methods that utilise optimization algorithms to find filters that approximate channels.

Methods based on sensitivity analysis are the most transparent in that they are based on approximating the loss due to changes to a network. The development of methods based on a Taylor series approximation represents the primary line of research in this category of methods. Different studies have adopted different assumptions in order to make the computation of the Taylor approximation feasible. In one of the first studies, the OBD method assumed a diagonal Hessian matrix, ignoring both first order gradients and second order non-diagonal gradients. This was followed by OBS, a method that aimed to take account of non-diagonal elements of the Hessian but has been shown to struggle when pruning multiple weights that are correlated. The EigenDamage method aims to take better account of correlations by approximating the Hessian with a Fisher Information Matrix and using a reparameterization to a new space in which the weights are approximately independent. In an alternative approach, the Collaborative Channel Pruning (CCP) method formulates the pruning task as a quadratic programming problem. Molchanov et al. [105] develop a method based on a first-order approximation, arguing that the variance in the loss, as training approaches a local solution, is an indicator of stability and provides a good measure of the importance of filters. In contrast to most of the other methods that adopt layer by layer pruning with fine tuning after each layer, both EigenDamage and CCP show that it is possible to obtain good results with one-shot pruning followed by fine-tuning. These three recent methods all show good results on large scale networks and data sets, though direct empirical comparisons between them have yet to be published. The survey also found two alternatives to the use of Taylor series approximations: a method that aims to learn which filters to prune [93] and a method based on the use of multi-armed bandits [118], both of which have the potential to explore new avenues of research on pruning methods.

The survey reveals a number of positive results about the Lottery Hypothesis: lotteries appear to perform well in transfer learning, and lotteries exist for tasks such as NLP and for architectures such as LSTMs.


Lotteries even seem to be independent of the type of optimizer used during training. Much of the current research on lotteries is based on deep networks, but it is interesting that one of the earliest papers in the field demonstrates the need to overparameterize a small feedforward network for modelling a 4-bit multiplexor. Thus, it might prove fruitful to explore the properties of lotteries on smaller problems as well as the larger networks of today. The existence of good lotteries does appear to depend on the fine-tuning process adopted, and an interesting observation, that challenges some of the empirical studies reported, is that even random pruning can achieve good results following fine-tuning [119], so further studies of how the remaining weights compensate for those that are removed could result in new insights. Although studies on lotteries provide valuable insight, further research on specialist hardware and libraries is needed for methods that prune individual weights to become practical [120].

The survey found least research on methods that use similarity and clustering to develop pruning methods. A method that utilised a cosine similarity measure concluded that it was more suitable for MLPs than CNNs, while a method that utilises agglomerative clustering of filters results in up to a 3-fold reduction on ResNet when it is applied to ImageNet. These results suggest there is merit in developing a more theoretical understanding of the functional equivalence of different classes of deep networks, analogous to the studies on equivalence of MLPs [121].

Given the different approaches to pruning, some may be complementary, and there is some evidence that combining them might result in further compression of networks. For example, He et al. [94] present results showing that combining their method based on the use of Lasso regression with factorization results in additional gains, and Han et al. [98] use a pipeline of magnitude pruning, clustering and Huffman coding to increase the level of compression that can be achieved.

One of the challenges in making sense of the empirical evaluations reported in the papers surveyed is that, as new deep learning architectures have developed and as new methods have been published, the comparisons carried out have evolved. The survey has therefore collated the published results of over 50 methods for a variety of data sets and architectures, which is available as a resource for other researchers. Section VI uses this resource to present the first comprehensive comparison of published results across different pruning methods for different architectures. The comparison of published results shows that significant reductions can be obtained for AlexNet, ResNet and VGG, though there is no single method that is best, and that it is harder to prune ResNet than the other architectures. One can hypothesize that its use of skip connections makes it more optimal, though this is something that needs exploring. Likewise, given that different methods seem best for different architectures, it is worth studying and developing methods for specific architectures. The data also reveals that there are limited evaluations on other networks, such as InceptionNet, DenseNet, SegNet and FCN32, and datasets such as CIFAR100, Flowers102 and CUB200-2011 (see Table 2). A comprehensive independent evaluation of the methods that includes consideration of the issues raised by the Lottery hypothesis across a wider range of data and architectures would be a useful advance in the field. In conclusion, this survey has presented the key research directions in pruning neural networks by summarizing how the field has progressed from the early algorithms that focused on small fully connected networks to the much larger deep neural networks of today. The survey has aimed to highlight the motivations and insights identified in the papers, and provides a resource for comparison of the reported results, architectures and data sets used in several studies, which we hope will be useful to researchers in the field.

APPENDIX A
SUMMARY OF DATA SETS USED IN COMPARING PRUNING METHODS
MNIST [61]: The MNIST (Modified National Institute of Standards and Technology) data set consists of handwritten 28 × 28 images of digits. It has 60,000 examples of training data and 10,000 examples for the test set.
PASCAL VOC [198]: The PASCAL VOC data sets have formed the basis of annual competitions from 2005 to 2012. The VOC 2007 data annotates objects in 20 classes and consists of 9,963 images and 24,640 annotated objects. The VOC 2012 data, which consists of 11530 images, is annotated with 27450 regions of interest and 6929 segmentations.
CamVid [199]: CamVid (Cambridge-driving Labelled Video Database) is a data set with videos captured from an automobile. In total, over 10 minutes of video is provided, together with over 700 images from the videos that have been labelled. Each pixel of an image is labelled to indicate whether it is part of an object in one of 32 semantic classes.
Oxford-Flowers [200]: The Oxford-Flowers data consists of 102 classes of common flowers in the UK. It provides 2040 training images and 6129 images for testing.
LFW [201]: The LFW (Labelled Faces in the Wild) data set is one of the largest and most widely used data sets to evaluate face recognition algorithms. It includes 250 × 250 pixel images of over 5.7K individuals, with over 13K images in total.
CIFAR-10 & 100 [202]: The CIFAR-10 (Canadian Institute for Advanced Research) data set is a collection of 32 × 32 colour images in 10 different classes. The data set splits into two sets: 50,000 images for training and 10,000 for testing. CIFAR-100 is similar to CIFAR-10 but has 100 classes, where each class has 500 training images and 100 test images.


ImageNet [203]: ImageNet contains millions of images organized using the WordNet hierarchy. It has over 14M images classified in over 21K groups and has provided the data sets for the ImageNet Large Scale Visual Recognition Challenges (ILSVRC) held since 2010. It is one of the most widely used data sets in benchmarking deep learning models and methods for pruning. A smaller subset known as TinyImageNet is sometimes used and is also available (https://tiny-imagenet.herokuapp.com). It consists of 200 classes with 500 training, 50 validation and 50 testing images per class.
SVHN [204]: The SVHN (Street View House Number) data set is a collection of 600K, 32 × 32 images of house numbers in Google Street View images. The data set provides 73,257 images for training and 26,032 for testing.
UCSD-Birds [205]: The UCSD-Birds data set provides 11788 images of birds, labelled as one of 200 different species. The data is split into training and testing sets of 5994 and 5794 images respectively.
Places365 [206]: Places365 is a data set with 8 million 200 × 200 pixel images of scenes labeled with one of 434 categories, such as bridge, kitchen, boxing ring, etc. It provides 50 images per class for validation, and the test set consists of 500 images per class.
CASIA-WebFace [207]: CASIA-WebFace is a data set that was created for evaluating face recognition systems. It provides over 494K images of over 10K individuals.
WMT'14 En2De [208]: WMT'14 En2De is one of the benchmark language data sets provided for a task set at the Workshop on Statistical Machine Translation held in 2014. This data set consists of 4.5M English-German pairs of sentences.
FashionMNIST [209]: FashionMNIST is an alternative to the MNIST data set with 28 × 28 images of fashion products classified in 10 categories. Like MNIST, there are 60,000 images for training and 10,000 images for testing.

APPENDIX B
SUMMARY OF NOTATION
• In general, we use X to denote input channels, W to denote weights of filters and Y to denote output channels.
• Y_{i,j} is used to denote the output feature map obtained by applying a filter W_{j,i} on input channels X_i.
• w_i, w_j, w_{j,i} are used to denote individual weights.
• β is used to denote a binary mask, where β_i = 1 indicates that a feature map or filter should be retained and β_i = 0 indicates that it should be removed.
• L is used to denote a loss function.
• L0, L1, L2 denote norms, with L0 counting non-zero values, L1 being the sum of absolute values, and L2 being the square root of the sum of squares (Euclidean distance).
• ‖W‖_n is used to indicate the use of a norm in an equation, with the subscript n indicating the specific norm.
• ‖W‖_F, known as the Frobenius norm, is sometimes used to denote the application of the Euclidean distance to the elements of a matrix.

REFERENCES
[1] S. Kuutti, R. Bowden, Y. C. Jin, P. Barber, and S. Fallah, ‘‘A survey of deep learning applications to autonomous vehicle control,’’ IEEE Trans. Intell. Transp. Syst., vol. 22, no. 2, pp. 712–733, Feb. 2020.
[2] S. M. McKinney, M. Sieniek, V. Godbole, N. Antropova, H. Ashrafian, T. Back, M. Chesus, C. GC, A. Darzi, M. Etemadi, F. Garcia-Vicente, F. Gilbert, M. Halling-Brown, D. Hassabis, and S. Jansen, ‘‘International evaluation of an AI system for breast cancer screening,’’ Nature, vol. 577, no. 7788, pp. 89–94, Jan. 2020.
[3] G. Hinton, D. Y. Deng, G. Dahl, A.-R. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. Sainath, and B. Kingsbury, ‘‘Deep neural networks for acoustic modeling in speech recognition,’’ IEEE Signal Process. Mag., vol. 29, pp. 82–97, 2012.
[4] D. W. Otter, J. R. Medina, and J. K. Kalita, ‘‘A survey of the usages of deep learning for natural language processing,’’ IEEE Trans. Neural Netw. Learn. Syst., vol. 32, no. 2, pp. 604–624, Feb. 2021.
[5] S. Pouyanfar, S. Sadiq, Y. Yan, H. Tian, Y. Tao, M. P. Reyes, M.-L. Shyu, S.-C. Chen, and S. S. Iyengar, ‘‘A survey on deep learning: Algorithms, techniques, and applications,’’ ACM Comput. Surv., vol. 51, no. 5, p. 36, 2019.
[6] T. J. Sejnowski, ‘‘The unreasonable effectiveness of deep learning in artificial intelligence,’’ Proc. Nat. Acad. Sci. USA, vol. 117, no. 48, pp. 30033–30038, Dec. 2020.
[7] K. He, X. Zhang, S. Ren, and J. Sun, ‘‘Deep residual learning for image recognition,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2016, pp. 770–778.
[8] K. Simonyan and A. Zisserman, ‘‘Very deep convolutional networks for large-scale image recognition,’’ 2014, arXiv:1409.1556.
[9] M. C. Mozer and P. Smolensky, ‘‘Skeletonization: A technique for trimming the fat from a network via relevance assessment,’’ in Proc. Adv. Neural Inf. Process. Syst. (NIPS), 1988, pp. 107–115.
[10] J. K. Kruschke, ‘‘Creating local and distributed bottlenecks in hidden layers of back-propagation networks,’’ in Proc. Connectionist Models Summer School, 1988, pp. 120–126.
[11] R. Reed, ‘‘Pruning algorithms—A survey,’’ IEEE Trans. Neural Netw., vol. 4, no. 5, pp. 740–747, 1993.
[12] Y. Chauvin, ‘‘A back-propagation algorithm with optimal use of hidden units,’’ in Proc. Adv. Neural Inf. Process. Syst., 1988, pp. 519–526.
[13] D. E. Weigend, ‘‘Back-propagation, weight-elimination and time series prediction,’’ in Proc. Connectionist Models Summer School, 1990, pp. 105–116.
[14] A. S. Weigend, D. E. Rumelhart, and B. A. Huberman, ‘‘Generalization by weight-elimination applied to currency exchange rate prediction,’’ in Proc. Seattle Int. Joint Conf. Neural Netw. (IJCNN), Nov. 1991, pp. 837–841.
[15] Z. Zhou, W. Zhou, R. Hong, and H. Li, ‘‘Online filter weakening and pruning for efficient convnets,’’ in Proc. IEEE Int. Conf. Multimedia Expo. (ICME), Jul. 2018, pp. 1–6.
[16] A. M. Chen, H.-M. Lu, and R. Hecht-Nielsen, ‘‘On the geometry of feedforward neural network error surfaces,’’ Neural Comput., vol. 5, no. 6, pp. 910–927, Nov. 1993.
[17] S. Han, X. Liu, H. Mao, J. Pu, A. Pedram, M. A. Horowitz, and W. J. Dally, ‘‘EIE: Efficient inference engine on compressed deep neural network,’’ 2016, arXiv:1602.01528.
[18] L. Li, J. Zhu, and M.-T. Sun, ‘‘Deep learning based method for pruning deep neural networks,’’ in Proc. IEEE Int. Conf. Multimedia Expo. Workshops (ICMEW), Jul. 2019, pp. 312–317.
[19] A. RoyChowdhury, P. Sharma, E. Learned-Miller, and A. Roy, ‘‘Reducing duplicate filters in deep neural networks,’’ in Proc. NIPS Workshop Deep Learning: Bridging Theory Pract., 2017, pp. 1–7.
[20] H. J. Sussmann, ‘‘Uniqueness of the weights for minimal feedforward nets with a given input-output map,’’ Neural Netw., vol. 5, no. 4, pp. 589–593, Jul. 1992.
[21] Z. Zhou, W. Zhou, H. Li, and R. Hong, ‘‘Online filter clustering and pruning for efficient convnets,’’ in Proc. 25th IEEE Int. Conf. Image Process. (ICIP), Oct. 2018, pp. 11–15.
[22] Y. LeCun, J. S. Denker, S. A. Solla, R. E. Howard, and L. D. Jackel, ‘‘Optimal brain damage,’’ in Proc. Neural Inf. Process. Syst., vol. 89, D. S. Touretzky, Ed. San Mateo, CA, USA: Morgan Kaufmann, 1990, pp. 506–598.
[23] B. Hassibi, D. G. Stork, and G. J. Wolff, ‘‘Optimal brain surgeon and general network pruning,’’ in Proc. IEEE Int. Conf. Neural Netw., Mar. 1993, pp. 293–299.
[24] B. Hassibi, D. G. Stork, G. Wolff, and T. Watanabe, ‘‘Optimal brain surgeon: Extensions and performance comparison,’’ in Proc. Neural Inf. Process. Syst., 1993, pp. 263–279.

[25] J. P. Cohen, H. Z. Lo, and W. Ding, ‘‘RandomOut: Using a convolutional [49] P. K. Gadosey, Y. Li, and P. T. Yamak, ‘‘On pruned, quantized and
gradient norm to rescue convolutional filters,’’ 2016, arXiv:1602.05931. compact CNN architectures for vision applications: An empirical study,’’
[26] N. Lee, T. Ajanthan, and P. H. S. Torr, ‘‘SNIP: Single-shot network in Proc. Int. Conf. Artif. Intell., Inf. Process. Cloud Comput. (AIIPCC),
pruning based on connection sensitivity,’’ 2018, arXiv:1810.02340. 2019, pp. 1–8.
[27] S. Lin, R. Ji, Y. Li, Y. Wu, F. Huang, and B. Zhang, ‘‘Accelerating [50] V. Lebedev and V. Lempitsky, ‘‘Speeding-up convolutional neural net-
convolutional networks via global & dynamic filter pruning,’’ in Proc. works: A survey,’’ Bull. Polish Acad. Sci. Tech. Sci., vol. 66, no. 6,
27th Int. Joint Conf. Artif. Intell., Jul. 2018, pp. 2425–2432. pp. 799–810, 2018.
[28] C. Bucilua, R. Caruana, and A. Niculescu-Mizil, ‘‘Model compression,’’ [51] T. Elsken, J. H. Metzen, and F. Hutter, ‘‘Neural architecture search: A
in Proc. 12th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, survey,’’ J. Mach. Learn. Res., vol. 20, no. 55, pp. 1–21, 2019.
pp. 535–541. ACM, 2006. [52] G. Menghani, ‘‘Efficient deep learning: A survey on making deep learning
[29] G. Hinton, O. Vinyals, and J. Dean, ‘‘Distilling the knowledge in a neural models smaller, faster, and better,’’ 2021, arXiv:2106.08962.
network,’’ 2015, arXiv:1503.02531. [53] M. A. Arbib, The Handbook of Brain Theory and Neural Networks.
[30] G. Urban, J. K. Geras, S. E. Kahou, S. Aslan, R. C. Wang, A. Mohamed, Cambridge, MA, USA: MIT Press, 2003.
M. Philipose, and M. Richardson, ‘‘Do deep convolutional nets really [54] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. Cambridge,
need to be deep and convolutional,’’ in Proc. Int. Conf. Learn. Represent., MA, USA: MIT Press, 2016.
2017, pp. 1–13. [55] Z. Huang and N. Wang, ‘‘Data-driven sparse structure selection for deep
[31] L. Zhang, Z. Tan, J. Song, J. Chen, C. Bao, and K. Ma, ‘‘SCAN: A scal- neural networks,’’ in Proc. Eur. Conf. Comput. Vis. (ECCV), Sep. 2018,
able neural networks framework towards compact and efficient models,’’ pp. 304–320.
2019, arXiv:1906.03951. [56] S. Lin, ‘‘Toward compact convnets via structure-sparsity regularized
[32] T. N. Sainath, B. Kingsbury, V. Sindhwani, E. Arisoy, and B. Ramab- filter pruning,’’ IEEE Trans. Neural Netw. Learn. Syst., vol. 31, no. 2,
hadran, ‘‘Low-rank matrix factorization for deep neural network training pp. 574–588, Feb. 2019.
with high-dimensional output targets,’’ in Proc. IEEE Int. Conf. Acoust., [57] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and
Speech Signal Process., May 2013, pp. 6655–6659. R. Salakhutdinov, ‘‘Dropout: A simple way to prevent neural networks
[33] M. Jaderberg, A. Vedaldi, and A. Zisserman, ‘‘Speeding up convolutional from overfitting,’’ J. Mach. Learn. Res., vol. 15, no. 56, pp. 1929–1958,
neural networks with low rank expansions,’’ 2014, arXiv:1405.3866. 2014.
[34] S. Jung, C. Son, S. Lee, J. Son, J.-J. Han, Y. Kwak, S. J. Hwang, and [58] W. Wen, C. Wu, Y. Wang, Y. Chen, and H. Li, ‘‘Learning structured
C. Choi, ‘‘Learning to quantize deep networks by optimizing quantization sparsity in deep neural networks,’’ in Proc. 30th Int. Conf. Neural Inf.,
intervals with task loss,’’ in Proc. IEEE/CVF Conf. Comput. Vis. Pattern 2016, pp. 2082–2090.
[59] C. Zhao, B. Ni, J. Zhang, Q. Zhao, W. Zhang, and Q. Tian, ‘‘Variational
Recognit. (CVPR), Jun. 2019, pp. 4350–4359.
[35] A. Zhou, A. Yao, Y. Guo, L. Xu, and Y. Chen, ‘‘Incremental network convolutional neural network pruning,’’ in Proc. IEEE/CVF Conf. Com-
quantization: Towards lossless CNNs with low-precision weights,’’ 2017, put. Vis. Pattern Recognit. (CVPR), Jun. 2019, pp. 2780–2789.
[60] Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard,
arXiv:1702.03044.
W. Hubbard, and L. D. Jackel, ‘‘Backpropagation applied to handwritten
[36] Y. Zhao, X. Gao, D. Bates, R. Mullins, and C.-Z. Xu, ‘‘Focused quanti-
zip code recognition,’’ Neural Comput., vol. 1, no. 4, pp. 541–551, 1989.
zation for sparse CNNs,’’ in Proc. Adv. Neural Inf. Process. Syst., 2019,
[61] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, ‘‘Gradient-based learn-
pp. 5585–5594.
ing applied to document recognition,’’ Proc. IEEE, vol. 86, no. 11,
[37] M. Courbariaux, I. Hubara, D. Soudry, R. El-Yaniv, and Y. Bengio,
pp. 2278–2324, Nov. 1998.
‘‘Binarized neural networks: Training deep neural networks with weights [62] A. Krizhevsky, I. Sutskever, and G. E. Hinton, ‘‘ImageNet classification
and activations constrained to +1 or -1,’’ 2016, arXiv:1602.02830. with deep convolutional neural networks,’’ in Proc. Adv. Neural Inf.
[38] W. Chen, J. T. Wilson, S. Tyree, K. Q. Weinberger, and Y. Chen, ‘‘Com-
Process. Syst., 2012, pp. 1097–1105.
pressing neural networks with the hashing trick,’’ in Proc. Int. Conf. [63] A. Khan, A. Sohail, U. Zahoora, and A. S. Qureshi, ‘‘A survey of the
Mach. Learn., 2015, pp. 2285–2294. recent architectures of deep convolutional neural networks,’’ Artif. Intell.
[39] B. Jacob, S. Kligys, B. Chen, M. Zhu, M. Tang, A. Howard, H. Adam, Rev., vol. 53, no. 8, pp. 5455–5516, 2019.
and D. Kalenichenko, ‘‘Quantization and training of neural networks for [64] S. Han, J. Pool, J. Tran, and W. J. Dally, ‘‘Learning both weights and
efficient integer-arithmetic-only inference,’’ in Proc. IEEE/CVF Conf. connections for efficient neural networks,’’ 2015, arXiv:1506.02626.
Comput. Vis. Pattern Recognit. (CVPR), Jun. 2018, pp. 2704–2713. [65] Y. Guo, A. Yao, and Y. Chen, ‘‘Dynamic network surgery for effi-
[40] B. Baker, O. Gupta, N. Naik, and R. Raskar, ‘‘Designing neural network cient DNNs,’’ in Proc. 30th Int. Conf. Neural Inf. Process. Syst., 2016,
architectures using reinforcement learning,’’ in Proc. Int. Conf. Learn. pp. 1387–1395.
Represent., Nov. 2017, arXiv:1611.02167. [66] J. Frankle and M. Carbin, ‘‘The lottery ticket hypothesis: Finding
[41] X. Dai, H. Yin, and N. K. Jha, ‘‘NeST: A neural network synthesis tool sparse, trainable neural networks,’’ in Proc. Int. Conf. Learn. Represent.,
based on a grow-and-prune paradigm,’’ IEEE Trans. Comput., vol. 68, New Orleans, LA, USA, 2019, arXiv:1803.03635.
no. 10, pp. 1487–1497, Oct. 2019. [67] Z. Liu, M. Sun, T. Zhou, G. Huang, and T. Darrell, ‘‘Rethinking the value
[42] X. Li, Y. Zhou, Z. Pan, and J. Feng, ‘‘Partial order pruning: For of network pruning,’’ in Proc. 7th Int. Conf. Learn. Represent., (ICLR),
best speed/accuracy trade-off in neural architecture search,’’ in Proc. New Orleans, LA, USA, May 2019, arXiv:1810.05270.
IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2019, [68] J. Frankle, G. K. Dziugaite, D. M. Roy, and M. Carbin, ‘‘Stabilizing the
pp. 9145–9153. lottery ticket hypothesis,’’ 2019, arXiv:1903.01611.
[43] Z. Liu, H. Mu, X. Zhang, Z. Guo, X. Yang, K.-T. Cheng, and J. Sun, [69] S. Ari Morcos, H. Yu, M. Paganini, and Y. Tian, ‘‘One ticket to win
‘‘MetaPruning: Meta learning for automatic neural network channel prun- them all: Generalizing lottery ticket initializations across datasets and
ing,’’ in Proc. IEEE/CVF Int. Conf. Comput. Vis. (ICCV), Oct. 2019, optimizers,’’ in Proc. Neural Inf. Process. Syst. (NeuralPS), 2019, pp.
pp. 3295–3304. 4932–4942.
[44] M. Lin, R. Ji, Y. Zhang, B. Zhang, Y. Wu, and Y. Tian, ‘‘Channel pruning [70] N. Hubens, M. Mancas, M. Decombas, M. Preda, T. Zaharia, B. Gosselin,
via automatic structure search,’’ in Proc. 29th Int. Joint Conf. Artif. Intell., and T. Dutoit, ‘‘An experimental study of the impact of pre-training on the
Jul. 2020, pp. 673–679. pruning of a convolutional neural network,’’ in Proc. 3rd Int. Conf. Appl.
[45] Z. Zhong, J. Yan, W. Wu, J. Shao, and C.-L. Liu, ‘‘Practical block- Intell. Syst., Jan. 2020, pp. 1–6.
wise neural network architecture generation,’’ in Proc. IEEE/CVF Conf. [71] H. Yu, S. Edunov, Y. Tian, and A. S. Morcos, ‘‘Playing the lottery with
Comput. Vis. Pattern Recognit., Jun. 2018, pp. 2423–2432. rewards and multiple languages: Lottery tickets in RL and NLP,’’ 2019,
[46] B. Zoph and V. Q. Le, ‘‘Neural architecture search with reinforcement arXiv:1906.02768.
learning,’’ in Proc. Int. Conf. Learn. Represent., 2017. [72] S. Merity, C. Xiong, J. Bradbury, and R. Socher, ‘‘Pointer sentinel mixture
[47] J. Chung and T. Shin, ‘‘Simplifying deep neural networks for neuromor- models,’’ in Proc. Int. Conf. Learn. Represent., 2017, arXiv:1609.07843.
phic architectures,’’ in Proc. 53rd Annu. Design Autom. Conf., Jun. 2016, [73] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, N. Aidan
pp. 1–6. Gomez, U. Kaiser, and I. Polosukhin, ‘‘Attention is all you need,’’ in Proc.
[48] K. Goetschalckx, B. Moons, P. Wambacq, and M. Verhelst, ‘‘Efficiently 31st Int. Conf. Neural Inf. Process. Syst. (NIPS), Red Hook, NY, USA:
combining SVD, pruning, clustering and retraining for enhanced neural Curran Associates, 2017, pp. 6000–6010.
network compression,’’ in Proc. 2nd Int. Workshop Embedded Mobile [74] G. Brockman, V. Cheung, L. Pettersson, J. Schneider, J. Schulman,
Deep Learn., Jun. 2018, pp. 1–6. J. Tang, and W. Zaremba, ‘‘OpenAI gym,’’ 2016, arXiv:1606.01540.


[75] M. G. Bellemare, Y. Naddaf, J. Veness, and M. Bowling, ‘‘The arcade [101] C. Wang, R. Grosse, S. Fidler, and G. Zhang, ‘‘EigenDamage: Structured
learning environment: An evaluation platform for general agents,’’ pruning in the kronecker-factored eigenbasis,’’ in Proc. 36th Int. Conf.
J. Artif. Intell. Res., vol. 47, pp. 253–279, Jun. 2015. Mach. Learn., 2019, pp. 6566–6575.
[76] H. Li, A. Kadav, I. Durdanovic, H. Samet, and H. P. Graf, ‘‘Pruning filters [102] C. M. Bishop, Pattern Recognition and Machine Learning (Information
for efficient convnets,’’ in Proc. 5th Int. Conf. Learn. Represent., (ICLR), Science and Statistics). New York, NY, USA: Springer, 2006.
Toulon, France, Apr. 2017, arXiv:1608.08710. [103] S. Thrun, ‘‘The MONK’s problems—A performance comparison of dif-
[77] M. Denil, B. Shakibi, L. Dinh, and N. D. Freitas, ‘‘Predicting param- ferent learning algorithms,’’ Carnegie Mellon Univ., Pittsburgh, PA, USA,
eters in deep learning,’’ in Proc. Adv. Neural Inf. Process. Syst., 2013, Tech. Rep. CMU-CS-91-197, 1991.
pp. 2148–2156. [104] T. J. Sejnowski and C. R. Rosenberg, ‘‘Parallel networks that learn
[78] B. O. Ayinde, T. Inanc, and J. M. Zurada, ‘‘Redundant feature pruning for to pronounce English text,’’ Complex Syst., vol. 1, pp. 145–168,
accelerated inference in deep neural networks,’’ Neural Netw., vol. 118, Feb. 1987.
pp. 148–158, Oct. 2019. [105] P. Molchanov, S. Tyree, T. Karras, T. Aila, and J. Kautz, ‘‘Pruning
[79] A. Polyak and L. Wolf, ‘‘Channel-level acceleration of deep face repre- convolutional neural networks for resource efficient inference,’’ 2016,
sentations,’’ IEEE Access, vol. 3, pp. 2163–2175, 2015. arXiv:1611.06440.
[80] M. Lin, Q. Chen, and S. Yan, ‘‘Network in a network,’’ in Proc. 2nd Int. [106] P. Molchanov, A. Mallya, S. Tyree, I. Frosio, and J. Kautz, ‘‘Importance
Conf. Learn. Represent. (ICLR), 2014, arXiv:1312.4400. estimation for neural network pruning,’’ in Proc. IEEE/CVF Conf. Com-
[81] X. Zhang, J. Zou, K. He, and J. Sun, ‘‘Accelerating very deep
put. Vis. Pattern Recognit. (CVPR), Jun. 2019, pp. 11264–11272.
convolutional networks for classification and detection,’’ IEEE
[107] R. Grosse and J. Martens, ‘‘A Kronecker-factored approximate Fisher
Trans. Pattern Anal. Mach. Intell., vol. 38, no. 10, pp. 1943–1955,
matrix for convolution layers,’’ in Proc. Int. Conf. Mach. Learn.,
Oct. 2015.
Feb. 2016, pp. 573–582.
[82] A. Romero, N. Ballas, S. E. Kahou, A. Chassang, C. Gatta, and Y. Bengio,
[108] H. Peng, J. Wu, S. Chen, and J. Huang, ‘‘Collaborative channel prun-
‘‘FitNets: Hints for thin deep nets,’’ 2014, arXiv:1412.6550.
[83] J.-H. Luo and J. Wu, ‘‘An entropy-based pruning method for CNN com- ing for deep networks,’’ in Proc. Int. Conf. Mach. Learn., 2019,
pression,’’ 2017, arXiv:1706.05791. pp. 5113–5122.
[84] H. Hu, R. Peng, Y.-W. Tai, and C.-K. Tang, ‘‘Network trimming: A data- [109] Y. He, J. Lin, Z. Liu, H. Wang, L.-J. Li, and S. Han, ‘‘AMC: Automl for
driven neuron pruning approach towards efficient deep architectures,’’ model compression and acceleration on mobile devices,’’ in Proc. Eur.
2016, arXiv:1607.03250. Conf. Comput. Vis. (ECCV), Sep. 2018, pp. 784–800.
[85] M. Lin, L. Cao, S. Li, Q. Ye, Y. Tian, J. Liu, Q. Tian, and R. Ji, ‘‘Filter [110] C. Jiang, G. Li, C. Qian, and K. Tang, ‘‘Efficient DNN neuron pruning by
sketch for network pruning,’’ 2020, arXiv:2001.08514. minimizing layer-wise nonlinear reconstruction error,’’ in Proc. 27th Int.
[86] E. Liberty, ‘‘Simple and deterministic matrix sketching,’’ in Proc. 19th Joint Conf. Artif. Intell., Jul. 2018, pp. 2298–2304.
ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, Aug. 2013, [111] X. Dong, S. Chen, and S. J. Pan, ‘‘Learning to prune deep neural networks
pp. 581–588. via layer-wise optimal brain surgeon,’’ in Proc. 31st Int. Conf. Neural Inf.
[87] Y. He, P. Liu, Z. Wang, Z. Hu, and Y. Yang, ‘‘Filter pruning via geometric Process. Syst. (NIPS), Red Hook, NY, USA: Curran Associates, 2017,
median for deep convolutional neural networks acceleration,’’ in Proc. pp. 4860–4874.
IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2019, [112] Z. Zhuang, M. Tan, B. Zhuang, J. Liu, Y. Guo, Q. Wu, J. Huang,
pp. 4340–4349. and J. Zhu, ‘‘Discrimination-aware channel pruning for deep neural net-
[88] J.-H. Luo, J. Wu, and W. Lin, ‘‘ThiNet: A filter level pruning method for works,’’ in Proc. Adv. Neural Inf. Process. Syst., 2018, pp. 875–886.
deep neural network compression,’’ in Proc. IEEE Int. Conf. Comput. Vis. [113] K. Xu, X. Wang, Q. Jia, J. An, and D. Wang, ‘‘Globally soft filter
(ICCV), Oct. 2017, pp. 5058–5066. pruning for efficient convolutional neural networks,’’ Tech. Rep., 2018.
[89] Y. He, G. Kang, X. Dong, Y. Fu, and Y. Yang, ‘‘Soft filter pruning for [Online]. Available: https://paperswithcode.com/paper/globally-soft-
accelerating deep convolutional neural networks,’’ in Proc. 27th Int. Joint filter-pruning-for-efficient
Conf. Artif. Intell., Jul. 2018, pp. 2234–2240. [114] K. Neklyudov, D. Molchanov, A. Ashukha, and D. Vetrov, ‘‘Structured
[90] R. Yu, A. Li, C.-F. Chen, J.-H. Lai, V. I. Morariu, X. Han, M. Gao, Bayesian pruning via log-normal multiplicative noise,’’ in Proc. 31st Int.
C.-Y. Lin, and L. S. Davis, ‘‘NISP: Pruning networks using neuron Conf. Neural Inf. Process. Syst. (NIPS), Red Hook, NY, USA: Curran
importance score propagation,’’ in Proc. IEEE/CVF Conf. Comput. Vis. Associates, 2017, pp. 6778–6787.
Pattern Recognit., Jun. 2018, pp. 9194–9203. [115] R. Bao, X. Yuan, Z. Chen, and R. Ma, ‘‘Cross-entropy pruning for
[91] X. Ding, G. Ding, Y. Guo, J. Han, and C. Yan, ‘‘Approximated Ora- compressing convolutional neural networks,’’ Neural Comput., vol. 30,
cle filter pruning for destructive CNN width optimization,’’ 2019, no. 11, pp. 3128–3149, Nov. 2018.
arXiv:1905.04748. [116] S. Srinivas and R. V. Babu, ‘‘Data-free parameter pruning for deep neural
[92] Z. Liu, J. Li, Z. Shen, G. Huang, S. Yan, and C. Zhang, ‘‘Learning efficient networks,’’ 2015, arXiv:1507.06149.
convolutional networks through network slimming,’’ in Proc. IEEE Int. [117] Y. Sun, X. Wang, and X. Tang, ‘‘Sparsifying neural network connections
Conf. Comput. Vis. (ICCV), Oct. 2017, pp. 2755–2763. for face recognition,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit.
[93] Q. Huang, K. Zhou, S. You, and U. Neumann, ‘‘Learning to prune filters (CVPR), Jun. 2016, pp. 4856–4864.
in convolutional neural networks,’’ in Proc. IEEE Winter Conf. Appl. [118] S. Ameen and S. Vadera, ‘‘Pruning neural networks using multi-armed
Comput. Vis. (WACV), Mar. 2018, pp. 709–718. bandits,’’ Comput. J., vol. 63, no. 7, pp. 1099–1108, Jul. 2020.
[94] Y. He, X. Zhang, and J. Sun, ‘‘Channel pruning for accelerating very
[119] D. Mittal, S. Bhardwaj, M. M. Khapra, and B. Ravindran, ‘‘Studying the
deep neural networks,’’ in Proc. IEEE Int. Conf. Comput. Vis. (ICCV),
plasticity in deep convolutional neural networks using random pruning,’’
Oct. 2017, pp. 1398–1406.
Mach. Vis. Appl., vol. 30, no. 2, pp. 203–216, Mar. 2019.
[95] H. Wang, Q. Zhang, Y. Wang, and H. Hu, ‘‘Structured probabilis-
[120] E. Elsen, M. Dukhan, T. Gale, and K. Simonyan, ‘‘Fast sparse ConvNets,’’
tic pruning for convolutional neural network acceleration,’’ 2017,
in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR),
arXiv:1709.06994.
[96] J.-H. Luo and J. Wu, ‘‘AutoPruner: An end-to-end trainable filter pruning Jun. 2020, pp. 14629–14638.
method for efficient deep model inference,’’ 2018, arXiv:1805.08941. [121] V. Kårková and P. C. Kainen, ‘‘Functionally equivalent feedfor-
[97] J. Ye, X. Lu, Z. Lin, and J. Z. Wang, ‘‘Rethinking the smaller-norm-less- ward neural networks,’’ Neural Comput., vol. 6, no. 3, pp. 543–558,
informative assumption in channel pruning of convolution layers,’’ 2018, May 1994.
arXiv:1802.00124. [122] S. J. Hanson and L. Y. Pratt, ‘‘Comparing biases for minimal network
[98] S. Han, H. Mao, and W. J. Dally, ‘‘A deep neural network compres- construction with back-propagation,’’ in Proc. Adv. Neural Inf. Process.
sion pipeline: Pruning, quantization, Huffman encoding,’’ in Proc. ICLR, Syst., 1989, pp. 177–185.
2016, arXiv:1510.00149v5. [123] A. Graves, ‘‘Practical variational inference for neural networks,’’ in Proc.
[99] S. Son, S. Nah, and K. Lee, ‘‘Clustering convolutional kernels to com- Adv. Neural Inf. Process. Syst., 2011, pp. 2348–2356.
press deep neural networks,’’ in Proc. Eur. Conf. Comput. Vis. (ECCV), [124] A. Aghasi, A. Abdi, N. Nguyen, and J. Romberg, ‘‘Net-trim: Convex
Sep. 2018, pp. 225–240. pruning of deep neural networks with performance guarantee,’’ in Proc.
[100] Y. Li, S. Lin, B. Zhang, J. Liu, D. Doermann, Y. Wu, F. Huang, and 31st Int. Conf. Neural Inf. Process. Syst., Red Hook, NY, USA: Curran
R. Ji, ‘‘Exploiting kernel sparsity and entropy for interpretable CNN Associates, 2017, pp. 3180–3189.
compression,’’ in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. [125] M. Zhu and S. Gupta, ‘‘To prune, or not to prune: Exploring the efficacy
(CVPR), Jun. 2019, pp. 2800–2809. of pruning for model compression,’’ 2017, arXiv:1710.01878.


[126] C. Chen, F. Tung, N. Vedula, and G. Mori, ‘‘Constraint-aware deep [150] M. Lin, R. Ji, B. Chen, F. Chao, J. Liu, W. Zeng, Y. Tian, and Q. Tian,
neural network compression,’’ in Proc. Eur. Conf. Comput. Vis. (ECCV), ‘‘Training compact CNNs for image classification using dynamic-coded
Sep. 2018, pp. 400–415. filter fusion,’’ 2021, arXiv:2107.06916.
[127] D. Lee, S. Kang, and K. Choi, ‘‘ComPEND: Computation prun- [151] B. Hassibi and D. G. Stork, ‘‘Second order derivatives for net-
ing through early negative detection for ReLU in a deep neural work pruning: Optimal brain surgeon,’’ in Proc. Adv. Neural Inf.
network accelerator,’’ in Proc. Int. Conf. Supercomput., Jun. 2018, Process. Syst., San Mateo, CA, USA: Morgan Kaufmann, 1992,
pp. 139–148. pp. 164–171.
[128] G. Li, C. Qian, C. Jiang, X. Lu, and K. Tang, ‘‘Optimization based layer- [152] J. Xu and W. C. Daniel Ho, ‘‘A node pruning algorithm based on optimal
wise magnitude-based pruning for DNN compression,’’ in Proc. 27th Int. brain surgeon for feedforward neural networks,’’ in Proc. 3rd Int. Conf.
Joint Conf. Artif. Intell., Jul. 2018, pp. 2383–2389. Adv. Neural Netw. (ISNN), vol. 1. Berlin, Germany: Springer, 2006,
[129] C. Liu and Q. Liu, ‘‘Improvement of pruning method for convolution neu- pp. 524–529.
ral network compression,’’ in Proc. 2nd Int. Conf. Deep Learn. Technol. [153] C. Endisch, C. Hackl, and D. Schröder, ‘‘Optimal brain surgeon for
(ICDLT), 2018, pp. 57–60. general dynamic neural networks,’’ in Proc. Aritficial Intell. 13th Por-
[130] Z. Qin, F. Yu, C. Liu, and X. Chen, ‘‘Demystifying neural network filter tuguese Conf. Prog. Artif. Intell. (EPIA), Berlin, Germany: Springer,
pruning,’’ 2018, arXiv:1811.02639. 2007, pp. 15–28.
[131] R. Yazdani, M. Riera, J.-M. Arnau, and A. Gonzalez, ‘‘The dark side [154] C. Endisch, P. Stolze, P. Endisch, C. Hackl, and R. Kennel, ‘‘Levenberg–
of DNN pruning,’’ in Proc. ACM/IEEE 45th Annu. Int. Symp. Comput. Marquardt-based OBS algorithm using adaptive pruning interval for sys-
Archit. (ISCA), Jun. 2018, pp. 790–801. tem identification with dynamic neural networks,’’ in Proc. IEEE Int.
[132] T. Zhang, S. Ye, K. Zhang, J. Tang, W. Wen, M. Fardad, and Y. Wang, Conf. Syst., Man Cybern., Oct. 2009, pp. 3402–3408.
‘‘A systematic DNN weight pruning framework using alternating direc- [155] S. Ameen, ‘‘Optimizing deep learning networks using multi-armed ban-
tion method of multipliers,’’ in Proc. Eur. Conf. Comput. Vis. (ECCV), dits,’’ Ph.D. thesis, School Comput., Sci. Eng., Univ. Salford, Greater
Sep. 2018, pp. 184–199. Manchester, U.K., 2017.
[133] X. Ding, G. DIng, X. Zhou, Y. Guo, J. Han, and J. Liu, ‘‘Global sparse [156] S. Anwar, K. Hwang, and W. Sung, ‘‘Structured pruning of deep convolu-
momentum SGD for pruning very deep neural networks,’’ in Proc. Adv. tional neural networks,’’ J. Emerg. Technol. Comput. Syst., vol. 13, no. 3,
Neural Inf. Process. Syst., 2019, pp. 6379–6391. pp. 1–18, Feb. 2017.
[134] T. Dettmers and L. Zettlemoyer, ‘‘Sparse networks from scratch: Faster [157] J. Guo and M. Potkonjak, ‘‘Pruning filters and classes: Towards on-
training without losing performance,’’ 2019, arXiv:1907.04840. device customization of convolutional neural networks,’’ in Proc. 1st Int.
[135] S. Gui, H. N. Wang, H. Yang, C. Yu, Z. Wang, and J. Liu, ‘‘Model com- Workshop Deep Learn. Mobile Syst. Appl. (EMDL), 2017, pp. 13–17.
pression with adversarial robustness: A unified optimization framework,’’ [158] M. A. Carreira-Perpinan and Y. Idelbayev, ‘‘‘Learning-compression’
in Proc. Adv. Neural Inf. Process. Syst., 2019, pp. 1283–1294. algorithms for neural net pruning,’’ in Proc. IEEE/CVF Conf. Comput.
[136] K. Helwegen, J. Widdicombe, L. Geiger, Z. Liu, K.-T. Cheng, and Vis. Pattern Recognit., Jun. 2018, pp. 8532–8541.
R. Nusselder, ‘‘Latent weights do not exist: Rethinking binarized neural [159] L. N. Huynh, Y. Lee, and R. K. Balan, ‘‘D-pruner: Filter-based pruning
network optimization,’’ 2019, arXiv:1906.02107. method for deep convolutional neural network,’’ in Proc. 2nd Int. Work-
[137] L. Hou, J. Zhu, J. Kwok, F. Gao, T. Qin, and T.-Y. Liu, ‘‘Normalization shop Embedded Mobile Deep Learn., Jun. 2018, pp. 7–12.
helps training of quantized LSTM,’’ in Proc. Adv. Neural Inf. Process. [160] S. Chen, L. Lin, Z. Zhang, and M. Gen, ‘‘Evolutionary NetArchitecture
Syst., 2019, pp. 7344–7354. search for deep neural networks pruning,’’ in Proc. 2nd Int. Conf. Algo-
[138] K. Lee, H. Kim, H. Lee, and D. Shin, ‘‘Flexible group-level pruning rithms, Comput. Artif. Intell., Dec. 2019, pp. 189–196.
of deep neural networks for fast inference on mobile CPUs: Work-in- [161] W. Deng, X. Zhang, F. Liang, and G. Lin, ‘‘An adaptive empirical
progress,’’ in Proc. Int. Conf. Compliers, Archit. Synth. Embedded Syst. Bayesian method for sparse deep learning,’’ in Proc. Adv. Neural Inf.
Companion (CASES), 2019, pp. 1–2. Process. Syst., 2019, pp. 5564–5574.
[139] J. Li, Q. Qi, J. Wang, C. Ge, Y. Li, Z. Yue, and H. Sun, ‘‘OICSR: Out- [162] S. Jin, S. Di, X. Liang, J. Tian, D. Tao, and F. Cappello, ‘‘DeepSZ:
in-channel sparsity regularization for compact deep neural networks,’’ A novel framework to compress deep neural networks by using error-
in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), bounded lossy compression,’’ in Proc. 28th Int. Symp. High-Perform.
Jun. 2019, pp. 7046–7055. Parallel Distrib. Comput., Jun. 2019, pp. 159–170.
[140] Z. Liu, H. Tang, Y. Lin, and S. Han, ‘‘Point-voxel CNN for efficient [163] H. Li, N. Liu, X. Ma, S. Lin, S. Ye, T. Zhang, X. Lin, W. Xu, and Y. Wang,
3D deep learning,’’ in Proc. Adv. Neural Inf. Process. Syst., 2019, ‘‘ADMM-based weight pruning for real-time deep learning accelera-
pp. 963–973. tion on mobile devices,’’ in Proc. Great Lakes Symp. VLSI, May 2019,
[141] J. Song, Y. Chen, X. Wang, C. Shen, and M. Song, ‘‘Deep model trans- pp. 501–506.
ferability from attribution maps,’’ in Proc. Adv. Neural Inf. Process. Syst., [164] Z. Qin, F. Yu, C. Liu, and X. Chen, ‘‘CAPTOR: A class adaptive filter
2019, pp. 6179–6189. pruning framework for convolutional neural networks in mobile applica-
[142] Y. Xu, Y. Wang, J. Zeng, K. Han, X. U. Chunjing, D. Tao, and C. Xu, tions,’’ in Proc. 24th Asia South Pacific Design Autom. Conf., Jan. 2019,
‘‘Positive-unlabeled compression on the cloud,’’ in Proc. Adv. Neural Inf. pp. 444–449.
Process. Syst., 2019, pp. 2561–2570. [165] X. Xiao, Z. Wang, and S. Rajasekaran, ‘‘AutoPrune: Automatic network
[143] H. Zhou, J. Lan, R. Liu, and J. Yosinski, ‘‘Deconstructing lottery tickets: pruning by regularizing auxiliary parameters,’’ in Proc. Adv. Neural Inf.
Zeros, signs, and the supermask,’’ 2019, arXiv:1905.01067. Process. Syst., 2019, pp. 13681–13691.
[144] Y. Zhu, C. Li, B. Luo, J. Tang, and X. Wang, ‘‘Dense feature aggregation [166] Z. Bao, J. Liu, and W. Zhang, ‘‘Using distillation to improve network
and pruning for RGBT tracking,’’ in Proc. 27th ACM Int. Conf. Multime- performance after pruning and quantization,’’ in Proc. 2nd Int. Conf.
dia, Oct. 2019, pp. 465–472. Mach. Learn. Mach. Intell., Sep. 2019, pp. 3–6.
[145] T. Kim, D. Ahn, and J.-J. Kim, ‘‘V-LSTM: An efficient LSTM accelerator [167] C. Lemaire, A. Achkar, and P.-M. Jodoin, ‘‘Structured pruning of neural
using fixed nonzero-ratio viterbi-based pruning,’’ in Proc. ACM/SIGDA networks with budget-aware regularization,’’ in Proc. IEEE/CVF Conf.
Int. Symp. Field-Program. Gate Arrays, Feb. 2020, p. 326. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2019, pp. 9108–9116.
[146] Q. Li, C. Li, and H. Chen, ‘‘Incremental filter pruning via random walk [168] X. Dong and Y. Yang, ‘‘Network pruning via transformable architecture
for accelerating deep convolutional neural networks,’’ in Proc. 13th Int. search,’’ 2019, arXiv:1905.09717.
Conf. Web Search Data Mining, Jan. 2020, pp. 358–366. [169] S. Kundu and S. Sundaresan, ‘‘AttentionLite: Towards efficient self-
[147] W. Niu, X. Ma, S. Lin, S. Wang, X. Qian, X. Lin, Y. Wang, and B. Ren, attention models for vision,’’ 2020, arXiv:2101.05216.
‘‘PatDNN: Achieving real-time DNN execution on mobile devices with [170] P. Kaliamoorthi, A. Siddhant, E. Li, and M. Johnson, ‘‘Distilling large
pattern-based weight pruning,’’ in Proc. 25th Int. Conf. Architectural language models into tiny and effective students using pQRNN,’’ 2021,
Support Program. Lang. Operating Syst., Mar. 2020, pp. 907–922. arXiv:2101.08890.
[148] A. Dubey, M. Chatterjee, and N. Ahuja, ‘‘Coreset-based neural network [171] V. Lebedev, Y. Ganin, M. Rakhuba, I. Oseledets, and V. Lempitsky,
compression,’’ in Proc. Eur. Conf. Comput. Vis. (ECCV), Sep. 2018, ‘‘Speeding-up convolutional neural networks using fine-tuned CP-
pp. 454–470. decomposition,’’ 2014, arXiv:1412.6553.
[149] X. Ding, G. Ding, Y. Guo, and J. Han, ‘‘Centripetal SGD for pruning [172] S. Lin, R. Ji, C. Chen, D. Tao, and J. Luo, ‘‘Holistic CNN com-
very deep convolutional networks with complicated structure,’’ in Proc. pression via low-rank decomposition with knowledge transfer,’’ IEEE
IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2019, Trans. Pattern Anal. Mach. Intell., vol. 41, no. 12, pp. 2889–2905,
pp. 4943–4953. Dec. 2019.


SUNIL VADERA received the Ph.D. degree in computer science from The University of Manchester, in 1992. He is currently a Professor of computer science at the University of Salford, U.K., where he has served in many leadership roles, including as the Dean and the Head of the School of Computing, Science and Engineering, from 2011 to 2019. He currently leads the AI Foundry Project at the University of Salford, which supports SMEs to develop innovative products and services using AI. His main research interests include pruning deep networks and applications of AI. He is a fellow of the British Computer Society, a Chartered Engineer (C.Eng.), and a Chartered IT Professional (CITP). In 2014, he was awarded the U.K. BDO Best Indian Scientist and Engineer in recognition of his contributions to computing, science, and engineering.

SALEM AMEEN received the B.Eng. degree from the Department of Electrical and Electronic Engineering, University of Seventh April, Libya, in 1999, the M.Tech. degree in computer science and engineering from the Jaypee Institute of Information Technology, India, in 2009, and the Ph.D. degree in computer science from the University of Salford, in 2018. His main research interests include machine learning, deep learning, multi-armed bandits, image mining, and time series forecasting.