Deep Compression

2017-10-25  信步闲庭v

Approach

We introduce "deep compression", a three-stage pipeline: pruning, trained quantization, and Huffman coding, which together reduce the storage requirement of neural networks by 35× to 49× without affecting their accuracy.

Our method first prunes the network by learning only the important connections. Next, we quantize the weights to enforce weight sharing; finally, we apply Huffman coding.
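The three stages above can be sketched in a few functions. This is a minimal illustration, not the paper's implementation: it uses magnitude thresholding for pruning, a linear codebook as a stand-in for the paper's trained k-means weight sharing, and a small Huffman coder to estimate the average bits per quantized index. All function names and the threshold/bit-width parameters are hypothetical choices for the example.

```python
import heapq
from collections import Counter
import numpy as np

def prune(weights, threshold):
    """Stage 1 - magnitude pruning: zero out small connections."""
    return np.where(np.abs(weights) < threshold, 0.0, weights)

def quantize(weights, bits):
    """Stage 2 - weight sharing via a 2^bits-entry codebook.
    (Linear binning here; the paper trains the centroids with k-means.)"""
    nz = weights[weights != 0]
    edges = np.linspace(nz.min(), nz.max(), 2 ** bits + 1)
    idx = np.clip(np.digitize(nz, edges) - 1, 0, 2 ** bits - 1)
    centroids = (edges[:-1] + edges[1:]) / 2  # bin midpoints as shared weights
    out = weights.copy()
    out[weights != 0] = centroids[idx]
    return out, idx

def huffman_avg_bits(symbols):
    """Stage 3 - average code length (bits/symbol) of a Huffman code
    over the quantization indices."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return 1.0
    # Heap entries carry a tiebreak id so dicts are never compared.
    heap = [(c, i, {s: ""}) for i, (s, c) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        c1, _, m1 = heapq.heappop(heap)
        c2, _, m2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in m1.items()}
        merged.update({s: "1" + code for s, code in m2.items()})
        heapq.heappush(heap, (c1 + c2, next_id, merged))
        next_id += 1
    _, _, codes = heap[0]
    total = sum(counts.values())
    return sum(counts[s] * len(codes[s]) for s in counts) / total
```

Running the pipeline on a random weight matrix shows where each stage saves storage: pruning shrinks the number of stored values, quantization replaces each survivor with a short codebook index, and Huffman coding exploits the skewed index distribution to push the average index length below the fixed bit width.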


Experiment

References:
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding, Song Han, ICLR 2016
