A Survey on Deep Neural Network Pruning: Taxonomy, Comparison, Analysis, and Recommendations

Modern deep neural networks, particularly recent large language models, come with massive model sizes that require significant computational and storage resources.

GitHub Link

The GitHub link is https://github.com/hrcheng1066/awesome-pruning

Introduction

The GitHub repository "awesome-pruning" is a comprehensive collection of neural network pruning research and open-source code. It covers various aspects of pruning, including static and dynamic pruning, learn-to-prune strategies, and applications in computer vision, natural language processing, and audio signal processing. The repository organizes pruning methods by criteria such as the timing of pruning and the specific technique used, and provides an extensive list of relevant papers with their associated resources. The work aims to be a valuable reference for researchers and practitioners interested in neural network pruning.

Content

Taxonomy: In our survey, we provide a comprehensive review of the state-of-the-art in deep neural network pruning, which we categorize along five orthogonal axes: Universal/Specific Speedup, When to Prune, Pruning Criteria, Learn to Prune, and Fusion of Pruning and Other Techniques.
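To make the "Pruning Criteria" axis concrete, below is a minimal sketch of one of the most common criteria, magnitude pruning, which removes the weights with the smallest absolute values. The function `magnitude_prune` and its exact behavior are illustrative assumptions, not code from the survey or the repository.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the `sparsity` fraction of weights with the smallest magnitude.

    This is an unstructured, one-shot pruning sketch: it returns a copy of
    `weights` in which the smallest-|w| entries are set to zero.
    """
    k = int(weights.size * sparsity)  # number of weights to remove
    if k == 0:
        return weights.copy()
    # Threshold at the k-th smallest absolute value across the whole tensor.
    threshold = np.sort(np.abs(weights), axis=None)[k - 1]
    mask = np.abs(weights) > threshold  # keep only weights above the threshold
    return weights * mask

# Example: prune the 50% smallest-magnitude weights of a tiny 2x2 layer.
w = np.array([[0.1, -0.8],
              [0.05, 1.2]])
pruned = magnitude_prune(w, 0.5)
```

Structured variants apply the same idea at a coarser granularity (whole filters or channels), which maps onto the survey's Universal/Specific Speedup axis: unstructured sparsity needs specialized hardware or kernels to realize speedups, while structured pruning shrinks the dense tensors directly.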

Alternatives & Similar Tools

LongLLaMA: handles very long text contexts, up to 256,000 tokens

LongLLaMA is a large language model designed to handle very long text contexts, up to 256,000 tokens. It's based on OpenLLaMA and uses a technique called Focused Transformer (FoT) for training. The repository provides a smaller 3B version of LongLLaMA for free use. It can also be used as a replacement for LLaMA models with shorter contexts.