Beyond int8 quantization, the biggest memory savings usually come from reducing input resolution or channels, using depthwise separable convolutions, replacing fully connected layers with global average pooling, and trimming the widest layers to lower peak activation memory.
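
A minimal PyTorch sketch of how those ideas fit together (the layer widths, input size of 96x96, and the `TinyNet` / `DepthwiseSeparableConv` names are illustrative assumptions, not from any specific model):

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Depthwise 3x3 conv followed by a 1x1 pointwise conv:
    far fewer weights than a standard Conv2d with the same in/out channels."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3, stride=stride,
                                   padding=1, groups=in_ch, bias=False)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

class TinyNet(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            # Stride-2 stages downsample early, which cuts peak activation memory.
            DepthwiseSeparableConv(3, 16, stride=2),
            DepthwiseSeparableConv(16, 32, stride=2),
            DepthwiseSeparableConv(32, 64, stride=2),
        )
        # Global average pooling replaces flatten + a large fully connected layer,
        # so the classifier head is just 64 * num_classes weights.
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x):
        x = self.features(x)
        x = self.pool(x).flatten(1)
        return self.classifier(x)

model = TinyNet()
# A smaller input resolution (e.g. 96x96 instead of 224x224) shrinks every
# intermediate activation, which usually dominates peak memory on small devices.
dummy = torch.randn(1, 3, 96, 96)
print(model(dummy).shape)  # torch.Size([1, 10])
```

Which of these helps most depends on whether your bottleneck is weight storage (depthwise separable convs, global average pooling) or peak activation memory (input resolution, early downsampling, trimming the widest layers).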
