Awesome Model Quantization (Efficient-ML/Awesome-Model-Quantization)

This repo collects papers, documents, and code about model quantization for anyone who wants to research the topic. We are continuously improving the project; pull requests adding works (papers, repositories) that the repo misses are welcome.
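For readers new to the area, the core operation studied by most papers below can be illustrated in a few lines. This is a minimal sketch of symmetric per-tensor uniform quantization (the simplest scheme, not the method of any particular paper listed here); the function names are ours, chosen for illustration:

```python
import numpy as np

def quantize_symmetric(w: np.ndarray, num_bits: int = 8):
    """Symmetric uniform quantization: map floats onto a signed integer grid."""
    qmax = 2 ** (num_bits - 1) - 1           # 127 for 8-bit
    scale = np.max(np.abs(w)) / qmax         # per-tensor scale factor
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximation of the original floats from the integer grid."""
    return q.astype(np.float32) * scale

w = np.array([0.5, -1.2, 0.03, 0.9], dtype=np.float32)
q, scale = quantize_symmetric(w)
w_hat = dequantize(q, scale)
# Rounding error per element is bounded by scale / 2
print(np.max(np.abs(w - w_hat)))
```

Much of the literature below refines this picture: choosing scales per channel or per group, handling outliers, learning rotations or transformations before quantizing, or lowering the bit-width from 8 toward 4, 2, or 1.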

Benchmarks

1. BiBench: Benchmarking and Analyzing Network Binarization [Paper] [Code] GitHub stars

Venue: ICML 2023

Authors: Haotong Qin, Mingyuan Zhang, Yifu Ding, Aoyu Li, Zhongang Cai, Ziwei Liu, Fisher Yu, Xianglong Liu.


Bibtex
@inproceedings{qin2023bibench,
  title={BiBench: Benchmarking and Analyzing Network Binarization},
  author={Qin, Haotong and Zhang, Mingyuan and Ding, Yifu and Li, Aoyu and Cai, Zhongang and Liu, Ziwei and Yu, Fisher and Liu, Xianglong},
  booktitle={International Conference on Machine Learning (ICML)},
  year={2023}
}

2. An empirical study of LLaMA3 quantization: from LLMs to MLLMs [Paper] [Code] GitHub stars

Venue: Visual Intelligence 2024

Authors: Wei Huang, Xingyu Zheng, Xudong Ma, Haotong Qin, Chengtao Lv, Hong Chen, Jie Luo, Xiaojuan Qi, Xianglong Liu, Michele Magno.


Bibtex
@article{huang2024empirical,
  title={An empirical study of llama3 quantization: From llms to mllms},
  author={Huang, Wei and Zheng, Xingyu and Ma, Xudong and Qin, Haotong and Lv, Chengtao and Chen, Hong and Luo, Jie and Qi, Xiaojuan and Liu, Xianglong and Magno, Michele},
  journal={Visual Intelligence},
  volume={2},
  number={1},
  pages={36},
  year={2024},
  publisher={Springer}
}

3. An Empirical Study of Qwen3 Quantization [Paper] [Code] GitHub stars

Venue: Visual Intelligence 2026

Authors: Xingyu Zheng, Yuye Li, Haoran Chu, Yue Feng, Xudong Ma, Jie Luo, Jinyang Guo, Haotong Qin, Michele Magno, Xianglong Liu.


Bibtex
@article{zheng2025empirical,
  title={An empirical study of qwen3 quantization},
  author={Zheng, Xingyu and Li, Yuye and Chu, Haoran and Feng, Yue and Ma, Xudong and Luo, Jie and Guo, Jinyang and Qin, Haotong and Magno, Michele and Liu, Xianglong},
  journal={arXiv preprint arXiv:2505.02214},
  year={2025}
}

4. LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit [Paper] [Code] GitHub stars

Venue: EMNLP 2024 Industry Track

Authors: Ruihao Gong, Yang Yong, Shiqiao Gu, Yushi Huang, Chengtao Lv, Yunchen Zhang, Xianglong Liu, Dacheng Tao.


Bibtex
@inproceedings{gong2024llmc,
  title={Llmc: Benchmarking large language model quantization with a versatile compression toolkit},
  author={Gong, Ruihao and Yong, Yang and Gu, Shiqiao and Huang, Yushi and Lv, Chengtao and Zhang, Yunchen and Tao, Dacheng and Liu, Xianglong},
  booktitle={Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track},
  pages={132--152},
  year={2024}
}

5. RobustMQ: Benchmarking Robustness of Quantized Models [Paper]

Venue: Visual Intelligence 2023

Authors: Yisong Xiao, Aishan Liu, Tianyuan Zhang, Haotong Qin, Jinyang Guo, Xianglong Liu.


Bibtex
@article{xiao2023robustmq,
  title={Robustmq: benchmarking robustness of quantized models},
  author={Xiao, Yisong and Liu, Aishan and Zhang, Tianyuan and Qin, Haotong and Guo, Jinyang and Liu, Xianglong},
  journal={Visual Intelligence},
  volume={1},
  number={1},
  pages={30},
  year={2023},
  publisher={Springer}
}

Survey Papers

1. Binary Neural Networks: A Survey [Paper] [Blog]

Venue: Pattern Recognition 2020

Authors: Haotong Qin, Ruihao Gong, Xianglong Liu, Xiao Bai, Jingkuan Song, Nicu Sebe.


Bibtex
@article{Qin:pr20_bnn_survey,
  title={Binary neural networks: A survey},
  author={Qin, Haotong and Gong, Ruihao and Liu, Xianglong and Bai, Xiao and Song, Jingkuan and Sebe, Nicu},
  journal={Pattern Recognition},
  volume={105},
  pages={107281},
  year={2020}
}

2. A Survey of Low-bit Large Language Models: Basics, Systems, and Algorithms [Paper]

Venue: Neural Networks 2025

Authors: Ruihao Gong, Yifu Ding, Zining Wang, Chengtao Lv, Xingyu Zheng, Jinyang Du, Yang Yong, Shiqiao Gu, Haotong Qin, Jinyang Guo, Dahua Lin, Michele Magno, Xianglong Liu.


Bibtex
@article{gong2025survey,
  title={A survey of low-bit large language models: Basics, systems, and algorithms},
  author={Gong, Ruihao and Ding, Yifu and Wang, Zining and Lv, Chengtao and Zheng, Xingyu and Du, Jinyang and Yong, Yang and Gu, Shiqiao and Qin, Haotong and Guo, Jinyang and others},
  journal={Neural Networks},
  pages={107856},
  year={2025},
  publisher={Elsevier}
}

3. Low-bit Model Quantization for Deep Neural Networks: A Survey [Paper]

Venue: arXiv 2025

Authors: Kai Liu, Qian Zheng, Kaiwen Tao, Zhiteng Li, Haotong Qin, Wenbo Li, Yong Guo, Xianglong Liu, Linghe Kong, Guihai Chen, Yulun Zhang, Xiaokang Yang.


Bibtex
@article{liu2025low,
  title={Low-bit model quantization for deep neural networks: A survey},
  author={Liu, Kai and Zheng, Qian and Tao, Kaiwen and Li, Zhiteng and Qin, Haotong and Li, Wenbo and Guo, Yong and Liu, Xianglong and Kong, Linghe and Chen, Guihai and others},
  journal={arXiv preprint arXiv:2505.05530},
  year={2025}
}

Papers

2026

  • [ICLR] PT²-LLM: Post-Training Ternarization for Large Language Models [code] GitHub stars
  • [ICLR] Quant-dLLM: Post-Training Extreme Low-Bit Quantization for Diffusion Large Language Models
  • [ICLR] DVD-Quant: Data-free Video Diffusion Transformers Quantization
  • [ICLR] Q&C: When Quantization Meets Cache in Efficient Generation
  • [CVPR Findings] Q-MambaIR: Accurate Quantized Mamba for Efficient Image Restoration
  • [ICLR] Quantized Visual Geometry Grounded Transformer
  • [ICLR] Post-Training Quantization for Video Matting
  • [ICLR] QVGen: Pushing the Limit of Quantized Video Generative Models
  • [ICLR] QuantSparse: Comprehensively Compressing Video Diffusion Transformer with Model Quantization and Attention Sparsification [code] GitHub stars
  • [AAAI] First-Order Error Matters: Accurate Compensation for Quantized Large Language Models [code] GitHub stars
  • [AAAI] TR-DQ: Time-Rotation Diffusion Quantization
  • [ICLR] TurboQuant: Online Vector Quantization with Near-optimal Distortion Rate
  • [ICLR] Optimal Brain Restoration for Joint Quantization and Sparsification of LLMs [code] GitHub stars
  • [ICLR] AnyBCQ: Hardware Efficient Flexible Binary-Coded Quantization for Multi-Precision LLMs [code] GitHub stars
  • [ICLR] Tequila: Deadzone-free Ternary Quantization for Large Language Models
  • [ICLR] LogART: Pushing the Limit of Efficient Logarithmic Post-Training Quantization [code] GitHub stars
  • [ICLR] ParoQuant: Pairwise Rotation Quantization for Efficient Reasoning LLM Inference [code] GitHub stars
  • [ICLR] Improving Block-Wise LLM Quantization by 4-bit Generalized Normal Float Formats
  • [arXiv] D²Quant: Accurate Low-bit Post-Training Weight Quantization for LLMs
  • [arXiv] QuantLRM: Quantization of Large Reasoning Models via Fine-Tuning Signals
  • [arXiv] SliderQuant: Accurate Post-Training Quantization for LLMs
  • [arXiv] What Makes Low-Bit Quantization-Aware Training Work for Reasoning LLMs? A Systematic Study
  • [ICLR] Channel-Aware Mixed-Precision Quantization for Efficient Long-Context Inference
  • [ICLR] CodeQuant: Unified Clustering and Quantization for Enhanced Outlier Smoothing in Low-Precision Mixture-of-Experts
  • [ICLR] QeRL: Beyond Efficiency - Quantization-enhanced Reinforcement Learning for LLMs [code] GitHub stars
  • [ICLR] AutoQVLA: Not All Channels Are Equal in Vision-Language-Action Model's Quantization
  • [ICLR] Achieving low-bit Muon through subspace preservation and grid quantization
  • [ICLR] Shift-and-Sum Quantization for Visual Autoregressive Models
  • [ICLR] Inlier-Centric Post-Training Quantization for Object Detection Models
  • [ICLR] Efficient Quantization of Mixture-of-Experts with Theoretical Generalization Guarantees
  • [ICLR] BBQ: Boosting Quantization Entropy with Bell Box Quantization
  • [ICLR] Improving Block-Wise LLM Quantization by 4-bit Block-Wise Optimal Float (BOF4): Analysis and Variations [code] GitHub stars
  • [ICLR] Learning under Quantization for High-Dimensional Linear Regression
  • [ICLR] On-the-Fly Adaptation to Quantization: Configuration-Aware LoRA for Efficient Fine-Tuning of Quantized LLMs
  • [ICLR] Bridging the Gap Between Promise and Performance for FP4 Quantization [code] GitHub stars
  • [ICLR] KBVQ-MoE: KLT-guided SVD with Bias-Corrected Vector Quantization for MoE Large Language Models [code] GitHub stars
  • [ICLR] UniQL: Unified Quantization and Low-rank Compression for Adaptive Edge LLMs [code] GitHub stars
  • [ICLR] The Lattice Geometry of Neural Network Quantization: A Short Equivalence Proof of GPTQ and Babai's algorithm
  • [ICLR] DPQuant: Efficient and Private Model Training via Dynamic Quantization Scheduling
  • [ICLR] Towards Quantization-Aware Training for Ultra-Low-Bit Reasoning LLMs
  • [ICLR] A Convergence Analysis of Adaptive Optimizers under Floating-point Quantization
  • [ICLR] Training Dynamics Impact Post-Training Quantization Robustness [code] GitHub stars
  • [ICLR] SSDi8: Accurate and Efficient 8-bit Quantization for State Space Duality
  • [ICLR] The Geometry of LLM Quantization: GPTQ as Babai's Nearest Plane Algorithm
  • [ICLR] PTQ4ARVG: Post-Training Quantization for AutoRegressive Visual Generation Models [code] GitHub stars
  • [ICLR] QWHA: Quantization-Aware Walsh-Hadamard Adaptation for Parameter-Efficient Fine-Tuning on Large Language Models [code] GitHub stars
  • [ICLR] Gradient-Aligned Calibration for Post-Training Quantization of Diffusion Models
  • [ICLR] SERQ: Saliency-Aware Low-Rank Error Reconstruction for LLM Quantization
  • [ICLR] Compute-Optimal Quantization-Aware Training
  • [ICLR] PM-KVQ: Progressive Mixed-precision KV Cache Quantization for Long-CoT LLMs [code] GitHub stars
  • [ICLR] Beyond Outliers: A Study of Optimizers Under Quantization
  • [ICLR] Qronos: Correcting the Past by Shaping the Future... in Post-Training Quantization
  • [ICLR] MicroMix: Efficient Mixed-Precision Quantization with Microscaling Formats for Large Language Models [code] GitHub stars
  • [ICLR] TurboBoA: Faster and Exact Attention-aware Quantization without Backpropagation
  • [ICLR] Beyond Uniformity: Sample and Frequency Meta Weighting for Post-Training Quantization of Diffusion Models
  • [ICLR] Rethinking Residual Errors in Compensation-based LLM Quantization
  • [ICLR] SPR²Q: Static Priority-based Rectifier Routing Quantization for Image Super-Resolution [code] GitHub stars
  • [ICLR] STaMP: Sequence Transformation and Mixed Precision for Low-Precision Activation Quantization

2025

  • [ICML] Q-VDiT: Towards Accurate Quantization and Distillation of Video-Generation Diffusion Transformers [code] GitHub stars
  • [AAAI] MPQ-DM: Mixed Precision Quantization for Extremely Low Bit Diffusion Models
  • [ICML] SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models [code] GitHub stars
  • [TPAMI] BiVM: Accurate Binarized Neural Network for Efficient Video Matting
  • [NeurIPS] S²Q-VDiT: Accurate Quantized Video Diffusion Transformer with Salient Data and Sparse Token Distillation [code] GitHub stars
  • [CVPR] PassionSR: Post-Training Quantization with Adaptive Scale in One-Step Diffusion based Image Super-Resolution [code] GitHub stars
  • [ICLR] ARB-LLM: Alternating Refined Binarizations for Large Language Models [code] GitHub stars
  • [ICLR] BinaryDM: Accurate Weight Binarization for Efficient Diffusion Models [code] GitHub stars
  • [ICML] FlatQuant: Flatness Matters for LLM Quantization [code] GitHub stars
  • [ICML] RoSTE: An Efficient Quantization-Aware Supervised Fine-Tuning Approach for Large Language Models [code] GitHub stars
  • [ICML] GANQ: GPU-Adaptive Non-Uniform Quantization for Large Language Models
  • [ICML] Modulated Diffusion: Accelerating Generative Modeling with Modulated Quantization [code] GitHub stars
  • [NeurIPS] DartQuant: Efficient Rotational Distribution Calibration for LLM Quantization [code] GitHub stars
  • [AAAI] JAQ: Joint Efficient Architecture Design and Low-Bit Quantization
  • [AAAI] OAC: Output-adaptive Calibration for Accurate Post-Training Quantization of LLMs
  • [AAAI] Optimizing Quantized Diffusion Models via Distillation with Decay Timestep-Aware Loss
  • [AAAI] Quantifiable Quantization Sensitivity of Diffusion Models
  • [AAAI] TCAQ-DM: Timestep-Channel Adaptive Quantization for Diffusion Models
  • [ACL] EfficientQAT: Efficient Quantization-Aware Training for Large Language Models [code] GitHub stars
  • [ACL] L4Q: Parameter Efficient Quantization-Aware Fine-Tuning on Large Language Models
  • [ACL] MoQAE: Mixed-Precision Quantization for Long-Context LLM Inference via Mixture of Quantization-Aware Experts
  • [ACL] Outlier-Safe Pre-Training for Robust 4-Bit Quantization of Large Language Models
  • [ACL] PTQ1.61: Push the Real Limit of Extremely Low-Bit Post-Training Quantization Methods for Large Language Models [code] GitHub stars
  • [ACL] Unifying Uniform and Binary-coding Quantization for Accurate Compression of Large Language Models
  • [ACL] “Give Me BF16 or Give Me Death”? Accuracy-Performance Trade-Offs in LLM Quantization
  • [ACM MM] DilateQuant: Accurate and Efficient Quantization-Aware Training for Diffusion Models via Weight Dilation
  • [ACM MM] Learning Binarized Representations with Pseudo-positive Distillation
  • [ACM MM] MQuant: Unleashing the Inference Potential of Multimodal Large Language Models with Post-Training Quantization
  • [ACM MM] Pushing the Limit of Binarized Neural Network for Image Super Resolution with Smooth Information Transmission
  • [ACM MM] Quantization Meets OOD: Generalizable Quantization-aware Training from a Flatness Perspective
  • [EMNLP] AMQ: Enabling AutoML for Mixed-precision Weight-Only Quantization of Large Language Models
  • [EMNLP] Does quantization affect models' performance on long-input and long-output tasks?
  • [ICLR] CBQ: Cross-Block Quantization for Large Language Models
  • [ICLR] DGQ: Distribution-Aware Group Quantization for Text-to-Image Diffusion Models
  • [ICLR] LeanQuant: Accurate and Scalable Large Language Model Quantization with Loss-error-aware Grid
  • [ICLR] OSTQuant: Refining Large Language Model Quantization with Orthogonal and Scaling Transformations for Better Distribution Fitting [code] GitHub stars
  • [ICLR] QERA: an Analytical Framework for Quantization Error Reconstruction [code] GitHub stars
  • [ICLR] SpinQuant: LLM Quantization with Learned Rotations
  • [ICLR] SVDQuant: Absorbing Outliers by Low-Rank Component for 4-Bit Diffusion Models
  • [ICLR] ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation
  • [ICML] GuidedQuant: Large Language Model Quantization via Exploiting End Loss Guidance [code] GitHub stars
  • [ICML] ResQ: Mixed-Precision Quantization of Large Language Models with Low-Rank Residuals [code] GitHub stars
  • [NeurIPS] A Double Normalization Approach for Calibration-Free Low-Bit KV Cache Quantization
  • [NeurIPS] Binary Quadratic Quantization: Beyond First-Order Quantization for Real-Valued Matrix Compression
  • [NeurIPS] Learning Grouped Lattice Vector Quantizers for Low-Bit Large Language Models
  • [NeurIPS] LittleBit: Ultra Low-Bit Quantization via Latent Factorization
  • [NeurIPS] ParetoQ: Improving Scaling Laws in Extremely Low-bit LLM Quantization
  • [NeurIPS] Q-Palette: Fractional-Bit Quantizers Toward Optimal Weight-Only Post-Training Quantization
  • [NeurIPS] Wavelet-Enhanced High-Fidelity 1-Bit Quantization for LLMs
  • [ACL Findings] Achieving Binary Weight and Activation for LLMs using Post-Training Quantization
  • [EMNLP Findings] KurTail: Kurtosis-based LLM Quantization
  • [SIGMOD] Practical and Asymptotically Optimal Quantization of High-Dimensional Vectors in Euclidean Space for Approximate Nearest Neighbor Search [code] GitHub stars
  • [NeurIPS] QBasicVSR: Temporal Awareness Adaptation Quantization for Video Super-Resolution
  • [NeurIPS] Quantization Error Propagation: Revisiting Layer-Wise Post-Training Quantization
  • [NeurIPS] Point4Bit: Post Training 4-bit Quantization for Point Cloud 3D Detection
  • [NeurIPS] PMQ-VE: Progressive Multi-Frame Quantization for Video Enhancement [code] GitHub stars
  • [NeurIPS] VETA-DiT: Variance-Equalized and Temporally Adaptive Quantization for Efficient 4-bit Diffusion Transformers
  • [NeurIPS] LoTA-QAF: Lossless Ternary Adaptation for Quantization-Aware Fine-Tuning [code] GitHub stars
  • [NeurIPS] Efficient Multi-bit Quantization Network Training via Weight Bias Correction and Bit-wise Coreset Sampling
  • [NeurIPS] Efficient and Generalizable Mixed-Precision Quantization via Topological Entropy
  • [NeurIPS] QSCA: Quantization with Self-Compensating Auxiliary for Monocular Depth Estimation
  • [ICCV] Scheduling Weight Transitions for Quantization-Aware Training [code] GitHub stars
  • [ICCV] Task-Specific Zero-shot Quantization-Aware Training for Object Detection [code] GitHub stars
  • [ICCV] OuroMamba: A Data-Free Quantization Framework for Vision Mamba
  • [ICCV] FedWSQ: Efficient Federated Learning with Weight Standardization and Distribution-Aware Non-Uniform Quantization [code] GitHub stars
  • [ICCV] Semantic Alignment and Reinforcement for Data-Free Quantization of Vision Transformers [code] GitHub stars
  • [ICCV] QuantCache: Adaptive Importance-Guided Quantization with Hierarchical Latent and Layer Caching for Video Generation [code] GitHub stars
  • [ICCV] MixA-Q: Revisiting Activation Sparsity for Vision Transformers from a Mixed-Precision Quantization Perspective
  • [ICCV] DMQ: Dissecting Outliers of Diffusion Models for Post-Training Quantization [code] GitHub stars
  • [ICCV] AHCPTQ: Accurate and Hardware-Compatible Post-Training Quantization for Segment Anything Model
  • [ICCV] MSQ: Memory-Efficient Bit Sparsification Quantization
  • [ICCV] QuEST: Low-bit Diffusion Model Quantization via Efficient Selective Finetuning [code] GitHub stars
  • [ICML] MxMoE: Mixed-precision Quantization for MoE with Accuracy and Performance Co-Design [code] GitHub stars
  • [ICML] Learning from Loss Landscape: Generalizable Mixed-Precision Quantization via Adaptive Sharpness-Aware Gradient Aligning
  • [ICML] PARQ: Piecewise-Affine Regularized Quantization [code] GitHub stars
  • [ICML] Quamba2: A Robust and Scalable Post-training Quantization Framework for Selective State Space Models [code] GitHub stars
  • [ICML] LRA-QViT: Integrating Low-Rank Approximation and Quantization for Robust and Efficient Vision Transformers
  • [ICML] BoA: Attention-aware Post-training Quantization without Backpropagation
  • [ICML] MoEQuant: Enhancing Quantization for Mixture-of-Experts Large Language Models via Expert-Balanced Sampling and Affinity Guidance [code] GitHub stars
  • [ICML] NestQuant: nested lattice quantization for matrix products and LLMs
  • [ICML] Q-resafe: Assessing Safety Risks and Quantization-aware Safety Patching for Quantized Large Language Models [code] GitHub stars
  • [ICML] SLiM: One-shot Quantization and Sparsity with Low-rank Approximation for LLM Weight Compression [code] GitHub stars
  • [ICML] QT-DoG: Quantization-Aware Training for Domain Generalization [code] GitHub stars
  • [ICML] Matryoshka Quantization
  • [ICML] Merge-Friendly Post-Training Quantization for Multi-Target Domain Adaptation [code] GitHub stars
  • [ICML] Layer-wise Quantization for Quantized Optimistic Dual Averaging
  • [ICML] Outlier-Aware Post-Training Quantization for Discrete Graph Diffusion Models
  • [ICML] BlockDialect: Block-wise Fine-grained Mixed Format Quantization for Energy-Efficient LLM Inference
  • [ICML] GPTAQ: Efficient Finetuning-Free Quantization with Asymmetric Calibration [code] GitHub stars
  • [ICML] Optimizing Large Language Model Training Using FP4 Quantization
  • [ICML] SKIM: Any-bit Quantization Pushing The Limits of Post-Training Quantization
  • [ICML] SageAttention2: Efficient Attention with Thorough Outlier Smoothing and Per-thread INT4 Quantization [code] GitHub stars
  • [AAAI] Thinking in Granularity: Dynamic Quantization for Image Super-Resolution by Intriguing Multi-Granularity Clues [code] GitHub stars
  • [AAAI] D2-DPM: Dual Denoising for Quantized Diffusion Probabilistic Models [code] GitHub stars
  • [CVPR] Quantization without Tears
  • [CVPR] APHQ-ViT: Post-Training Quantization with Average Perturbation Hessian Based Reconstruction for Vision Transformer [code] GitHub stars
  • [ICLR] SynQ: Accurate Zero-shot Quantization by Synthesis-aware Fine-tuning [code] GitHub stars

2024

  • [ICML] BiLLM: Pushing the Limit of Post-Training Quantization for LLMs [code] GitHub stars
  • [ICML] Compressing Large Language Models by Joint Sparsification and Quantization
  • [NeurIPS] BiDM: Pushing the Limit of Quantization for Diffusion Models
  • [ACL Findings] DB-LLM: Accurate Dual-Binarization for Efficient LLMs
  • [NeurIPS] Binarized Diffusion Model for Image Super-Resolution [code] GitHub stars
  • [NeurIPS] 2DQuant: Low-bit Post-Training Quantization for Image Super-Resolution [code] GitHub stars
  • [ICML] Accurate LoRA-Finetuning Quantization of LLMs via Information Retention [code] GitHub stars
  • [ICML] Flexible Residual Binarization for Image Super-Resolution
  • [AAAI] Agile-Quant: Activation-Guided Quantization for Faster Inference of LLMs on the Edge
  • [AAAI] AQ-DETR: Low-Bit Quantized Detection Transformer with Auxiliary Queries
  • [AAAI] Bi-ViT: Pushing the Limit of Vision Transformer Quantization
  • [AAAI] Exploring Post-training Quantization in LLMs from Comprehensive Study to Low Rank Compensation
  • [AAAI] Make RepVGG Greater Again: A Quantization-Aware Approach
  • [AAAI] MetaMix: Meta-State Precision Searcher for Mixed-Precision Activation Quantization
  • [AAAI] Norm Tweaking: High-Performance Low-Bit Quantization of Large Language Models
  • [AAAI] OWQ: Outlier-Aware Weight Quantization for Efficient Fine-Tuning and Inference of Large Language Models
  • [AAAI] PTMQ: Post-training Multi-Bit Quantization of Neural Networks
  • [AAAI] Robustness-Guided Image Synthesis for Data-Free Quantization
  • [AAAI] What Makes Quantization for Large Language Model Hard? An Empirical Study from the Lens of Perturbation
  • [ACL] Improving Conversational Abilities of Quantized Large Language Models via Direct Preference Alignment
  • [ACM MM] Advancing Multimodal Large Language Models with Quantization-Aware Scale Learning Based on Warmup
  • [CVPR] Data-Free Quantization via Pseudo-label Filtering
  • [CVPR] Enhancing Post-training Quantization Calibration through Contrastive Learning
  • [CVPR] Instance-Aware Group Quantization for Vision Transformers
  • [CVPR] Mixed-Precision Quantization for Federated Learning on Resource-Constrained Heterogeneous Devices
  • [CVPR] PTQ4SAM: Post-Training Quantization for Segment Anything
  • [CVPR] Reg-PTQ: Regression-specialized Post-training Quantization for Fully Quantized Object Detector
  • [CVPR] Retraining-Free Model Quantization via One-Shot Weight-Coupling Learning
  • [CVPR] TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models
  • [CVPR] Towards Accurate Post-training Quantization for Diffusion Models
  • [ECCV] AdaLog: Post-Training Quantization for Vision Transformers with Adaptive Logarithm Quantizer
  • [ECCV] CLAMP-ViT: Contrastive Data-Free Learning for Adaptive Post-Training Quantization of ViTs
  • [ECCV] Memory-Efficient Fine-Tuning for Quantized Diffusion Model
  • [ECCV] MetaAug: Meta-Data Augmentation for Post-Training Quantization
  • [ECCV] MixDQ: Memory-Efficient Few-Step Text-to-Image Diffusion Models with Metric-Decoupled Mixed Precision Quantization
  • [ECCV] Overcoming Distribution Mismatch in Quantizing Image Super-Resolution Networks
  • [ECCV] Post-training Quantization with Progressive Calibration and Activation Relaxing for Text-to-Image Diffusion Models
  • [ECCV] PQ-SAM: Post-training Quantization for Segment Anything Model
  • [ECCV] Timestep-Aware Correction for Quantized Diffusion Models
  • [ECCV] Towards Robust Full Low-bit Quantization of Super Resolution Networks
  • [EMNLP] ApiQ: Finetuning of 2-Bit Quantized Large Language Model
  • [EMNLP] Prefixing Attention Sinks can Mitigate Activation Outliers for Large Language Model Quantization
  • [EMNLP] VPTQ: Extreme Low-bit Vector Post-Training Quantization for Large Language Models
  • [ICLR] AffineQuant: Affine Transformation Quantization for Large Language Models [code] GitHub stars
  • [ICLR] EfficientDM: Efficient Quantization-Aware Fine-Tuning of Low-Bit Diffusion Models
  • [ICLR] LiDAR-PTQ: Post-Training Quantization for Point Cloud 3D Object Detection
  • [ICLR] LoftQ: LoRA-Fine-Tuning-aware Quantization for Large Language Models [code] GitHub stars
  • [ICLR] LUT-GEMM: Quantized Matrix Multiplication based on LUTs for Efficient Inference in Large-Scale Generative Language Models
  • [ICLR] OmniQuant: Omnidirectionally Calibrated Quantization for Large Language Models [code] GitHub stars
  • [ICLR] PB-LLM: Partially Binarized Large Language Models [code] GitHub stars
  • [ICLR] QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models [code] GitHub stars
  • [ICLR] QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models
  • [ICLR] Rethinking Channel Dimensions to Isolate Outliers for Low-bit Weight Quantization of Large Language Models
  • [ICLR] SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression [code] GitHub stars
  • [ICML] A2Q+: Improving Accumulator-Aware Weight Quantization
  • [ICML] BiE: Bi-Exponent Block Floating-Point for Large Language Models Quantization
  • [ICML] ERQ: Error Reduction for Post-Training Quantization of Vision Transformers
  • [ICML] Evaluating Quantized Large Language Models
  • [ICML] Extreme Compression of Large Language Models via Additive Quantization
  • [ICML] FrameQuant: Flexible Low-Bit Quantization for Transformers
  • [ICML] KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache [code] GitHub stars
  • [ICML] LQER: Low-Rank Quantization Error Reconstruction for LLMs
  • [ICML] Outlier-aware Slicing for Post-Training Quantization in Vision Transformer
  • [ICML] Sharpness-Aware Data Generation for Zero-shot Quantization
  • [ICML] SqueezeLLM: Dense-and-Sparse Quantization [code] GitHub stars
  • [MLSys] AWQ: Activation-aware Weight Quantization for On-Device LLM Compression and Acceleration [code] GitHub stars
  • [NeurIPS] BitsFusion: 1.99 bits Weight Quantization of Diffusion Model
  • [NeurIPS] DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs
  • [NeurIPS] KV Cache is 1 Bit Per Channel: Efficient Large Language Model Inference with Coupled Quantization
  • [NeurIPS] KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization
  • [NeurIPS] PTQ4DiT: Post-training Quantization for Diffusion Transformers
  • [NeurIPS] Q-VLM: Post-training Quantization for Large Vision-Language Models
  • [NeurIPS] QBB: Quantization with Binary Bases for LLMs
  • [NeurIPS] ZipCache: Accurate and Efficient KV Cache Quantization with Salient Token Identification
  • [ACL Findings] A Comprehensive Evaluation of Quantization Strategies for Large Language Models
  • [ACL Findings] AFPQ: Asymmetric Floating Point Quantization for LLMs [code] GitHub stars
  • [ACL Findings] LLM-QAT: Data-Free Quantization Aware Training for Large Language Models
  • [EMNLP Findings] ATQ: Activation Transformation for Weight-Activation Quantization of LLMs
  • [EMNLP Findings] Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization
  • [EMNLP Findings] How Does Quantization Affect Multilingual LLMs?
  • [EMNLP Findings] MobileQuant: Mobile-friendly Quantization for On-device Language Models
  • [EMNLP Findings] QEFT: Quantization for Efficient Fine-Tuning of LLMs
  • [EMNLP Industry] LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit
  • [arXiv] APTQ: Attention-aware Post-Training Mixed-Precision Quantization for Large Language Models
  • [arXiv] EasyQuant: An Efficient Data-free Quantization Algorithm for LLMs
  • [arXiv] EdgeQAT: Entropy and Distribution Guided Quantization-Aware Training for the Acceleration of Lightweight LLMs on the Edge [code] GitHub stars
  • [arXiv] FlattenQuant: Breaking Through the Inference Compute-bound for Large Language Models with Per-tensor Quantization
  • [arXiv] GPTVQ: The Blessing of Dimensionality for LLM Quantization [code] GitHub stars
  • [arXiv] IntactKV: Improving Large Language Model Quantization by Keeping Pivot Tokens Intact
  • [arXiv] OneBit: Towards Extremely Low-bit Large Language Models
  • [arXiv] QuaRot: Outlier-Free 4-Bit Inference in Rotated LLMs [code] GitHub stars
  • [arXiv] RepQuant: Towards Accurate Post-Training Quantization of Large Transformer Models via Scale Reparameterization
  • [SIGMOD] RaBitQ: Quantizing High-Dimensional Vectors with a Theoretical Error Bound for Approximate Nearest Neighbor Search [code] GitHub stars
  • [AAAI] One-Step Forward and Backtrack: Overcoming Zig-Zagging in Loss-Aware Quantization Training
  • [ICML] Learning from Students: Applying t-Distributions to Explore Accurate and Efficient Formats for LLMs [code] GitHub stars
  • [ICML] Jetfire: Efficient and Accurate Transformer Pretraining with INT8 Data Flow and Per-Block Quantization
  • [ICML] Reshape and Adapt for Output Quantization (RAOQ): Quantization-aware Training for In-memory Computing Systems
  • [ICML] QuIP#: Even Better LLM Quantization with Hadamard Incoherence and Lattice Codebooks [code] GitHub stars
  • [NeurIPS] Towards Next-Level Post-Training Quantization of Hyper-Scale Transformers [code] GitHub stars
  • [NeurIPS] MagR: Weight Magnitude Reduction for Enhancing Post-Training Quantization [code] GitHub stars
  • [NeurIPS] Exploiting LLM Quantization
  • [NeurIPS] Efficient Multi-task LLM Quantization and Serving for Multiple LoRA Adapters
  • [NeurIPS] QTIP: Quantization with Trellises and Incoherence Processing [code] GitHub stars
  • [NeurIPS] Generalizing CNNs to graphs with learnable neighborhood quantization [code] GitHub stars
  • [NeurIPS] SDP4Bit: Toward 4-bit Communication Quantization in Sharded Data Parallelism for LLM Training [code] GitHub stars
  • [NeurIPS] Optimal and Approximate Adaptive Stochastic Quantization [code] GitHub stars
  • [NeurIPS] Cherry on Top: Parameter Heterogeneity and Quantization in Large Language Models
  • [NeurIPS] StepbaQ: Stepping backward as Correction for Quantized Diffusion Models

2023

  • [ICML] BiBench: Benchmarking and Analyzing Network Binarization [code] GitHub stars
  • [IJCV] Distribution-sensitive Information Retention for Accurate Binary Neural Network
  • [NeurIPS] BiMatting: Efficient Video Matting via Binarization [code] GitHub stars
  • [NeurIPS] QuantSR: Accurate Low-bit Quantization for Efficient Image Super-Resolution [code] GitHub stars
  • [TPAMI] Diverse Sample Generation: Pushing the Limit of Generative Data-Free Quantization [code] GitHub stars
  • [TNNLS] BiFSMNv2: Pushing Binary Neural Networks for Keyword Spotting to Real-Network Performance [code] GitHub stars
  • [AAAI] Fast and Accurate Binary Neural Networks Based on Depth-Width Reshaping
  • [AAAI] OMPQ: Orthogonal Mixed Precision Quantization
  • [AAAI] Quantized Feature Distillation for Network Quantization
  • [AAAI] Resilient Binary Neural Network
  • [AAAI] Rethinking Data-Free Quantization as a Zero-Sum Game
  • [ACL] Boost Transformer-based Language Models with GPU-Friendly Sparsity and Quantization
  • [ACL] PreQuant: A Task-agnostic Quantization Approach for Pre-trained Language Models
  • [CVPR] ABCD: Arbitrary Bitwise Coefficient for De-quantization
  • [CVPR] Adaptive Data-Free Quantization
  • [CVPR] Bit-shrinking: Limiting Instantaneous Sharpness for Improving Post-training Quantization
  • [CVPR] Boost Vision Transformer with GPU-Friendly Sparsity and Quantization
  • [CVPR] GENIE: Show Me the Data for Quantization [code] GitHub stars
  • [CVPR] Hard Sample Matters a Lot in Zero-Shot Quantization
  • [CVPR] NIPQ: Noise proxy-based Integrated Pseudo-Quantization
  • [CVPR] NoisyQuant: Noisy Bias-Enhanced Post-Training Activation Quantization for Vision Transformers
  • [CVPR] One-Shot Model for Mixed-Precision Quantization
  • [CVPR] PD-Quant: Post-Training Quantization Based on Prediction Difference Metric [code] GitHub stars
  • [CVPR] Post-training Quantization on Diffusion Models [code]
  • [CVPR] Q-DETR: An Efficient Low-Bit Quantized Detection Transformer [code] GitHub stars
  • [CVPR] Regularized Vector Quantization for Tokenized Image Synthesis
  • [CVPR] Solving Oscillation Problem in Post-Training Quantization Through a Theoretical Perspective [code] GitHub stars
  • [CVPR] Toward Accurate Post-Training Quantization for Image Super Resolution
  • [EMNLP] LLM-FP4: 4-Bit Floating-Point Quantized Transformers [code] GitHub stars
  • [EMNLP] Outlier Suppression+: Accurate quantization of large language models by equivalent and optimal shifting and scaling
  • [EMNLP] Revisiting Block-based Quantisation: What is Important for Sub-8-bit LLM Inference?
  • [EMNLP] Watermarking LLMs with Weight Quantization [code] GitHub stars
  • [EMNLP] Zero-Shot Sharpness-Aware Quantization for Pre-trained Language Models
  • [ICCV] A2Q: Accumulator-Aware Quantization with Guaranteed Overflow Avoidance
  • [ICCV] BiViT: Extremely Compressed Binary Vision Transformers
  • [ICCV] Causal-DFQ: Causality Guided Data-Free Network Quantization [code] GitHub stars
  • [ICCV] DenseShift: Towards Accurate and Efficient Low-Bit Power-of-Two Quantization
  • [ICCV] EMQ: Evolving Training-free Proxies for Automated Mixed Precision Quantization
  • [ICCV] EQ-Net: Elastic Quantization Neural Networks [code] GitHub stars
  • [ICCV] Estimator Meets Equilibrium Perspective: A Rectified Straight Through Estimator for Binary Neural Networks Training [code] GitHub stars
  • [ICCV] I-ViT: Integer-only Quantization for Efficient Vision Transformer Inference [code] GitHub stars
  • [ICCV] Jumping through Local Minima: Quantization in the Loss Landscape of Vision Transformers
  • [ICCV] Overcoming Forgetting Catastrophe in Quantization-Aware Training
  • [ICCV] Q-diffusion: Quantizing Diffusion Models [code] GitHub stars
  • [ICCV] QD-BEV: Quantization-aware View-guided Distillation for Multi-view 3D Object Detection
  • [ICCV] RepQ-ViT: Scale Reparameterization for Post-Training Quantization of Vision Transformers [code] GitHub stars
  • [ICCV] Unified Data-Free Compression: Pruning and Quantization without Fine-Tuning
  • [ICLR] Analog Bits: Generating Discrete Data using Diffusion Models with Self-Conditioning
  • [ICLR] GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers [code] GitHub stars
  • [ICML] Few-bit Backward: Quantized Gradients of Activation Functions for Memory Footprint Reduction [code] GitHub stars
  • [ICML] FlexRound: Learnable Rounding based on Element-wise Division for Post-Training Quantization [code]
  • [ICML] GPT-Zip: Deep Compression of Finetuned Large Language Models
  • [ICML] Oscillation-free Quantization for Low-bit Vision Transformers [code] GitHub stars
  • [ICML] QIGen: Generating Efficient Kernels for Quantized Inference on Large Language Models [code] GitHub stars
  • [ICML] Quantized Distributed Training of Large Models with Convergence Guarantees
  • [ICML] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models [code] GitHub stars
  • [ICML] The case for 4-bit precision: k-bit Inference Scaling Laws
  • [ICML] Understanding INT4 Quantization for Language Models: Latency Speedup, Composability, and Failure Cases [code] GitHub stars
  • [NeurIPS] Binarized Spectral Compressive Imaging [code] GitHub stars
  • [NeurIPS] Memory-Efficient Fine-Tuning of Compressed Large Language Models via sub-4-bit Integer Quantization
  • [NeurIPS] PackQViT: Faster Sub-8-bit Vision Transformers via Full and Packed Quantization on the Mobile
  • [NeurIPS] PTQD: Accurate Post-Training Quantization for Diffusion Models [code] GitHub stars
  • [NeurIPS] Q-DM: An Efficient Low-bit Quantized Diffusion Model
  • [NeurIPS] QLoRA: Efficient Finetuning of Quantized LLMs [code] GitHub stars
  • [NeurIPS] QuIP: 2-Bit Quantization of Large Language Models With Guarantees [code] GitHub stars
  • [NeurIPS] Temporal Dynamic Quantization for Diffusion Models
  • [NeurIPS] TexQ: Zero-shot Network Quantization with Texture Feature Distribution Calibration
  • [NeurIPS] Understanding Neural Network Binarization with Forward and Backward Proximal Quantizers
  • [TIP] MBFQuant: A Multiplier-Bitwidth-Fixed, Mixed-Precision Quantization Method for Mobile CNN-Based Applications
  • [TPAMI] Optimization-Based Post-Training Quantization With Bit-Split and Stitching
  • [TPAMI] Single-path Bit Sharing for Automatic Loss-aware Model Compression
  • [arXiv] Atom: Low-bit Quantization for Efficient and Accurate LLM Serving [code] GitHub stars
  • [arXiv] Efficient Post-training Quantization with FP8 Formats [code] GitHub stars
  • [arXiv] QFT: Quantized Full-parameter Tuning of LLMs with Affordable Resources
  • [arXiv] QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models
  • [arXiv] RPTQ: Reorder-based Post-training Quantization for Large Language Models [code] GitHub stars
  • [arXiv] ZeroQuant-HERO: Hardware-Enhanced Robust Optimized Post-Training Quantization Framework for W8A8 Transformers
  • [AAAI] Quantization-Aware Interval Bound Propagation for Training Certifiably Robust Quantized Neural Networks [code] GitHub stars
  • [ICLR] PowerQuant: Automorphism Search for Non-Uniform Quantization
  • [ICLR] Block and Subword-Scaling Floating-Point (BSFP): An Efficient Non-Uniform Quantization For Low Precision Inference
  • [NeurIPS] REx: Data-Free Residual Quantization Error Expansion
  • [NeurIPS] Intriguing Properties of Quantization at Scale
  • [NeurIPS] Training Transformers with 4-bit Integers [code] GitHub stars
  • [NeurIPS] Towards Efficient and Accurate Winograd Convolution via Full Quantization
  • [NeurIPS] Pruning vs Quantization: Which is Better? [code] GitHub stars
  • [ICLR] A^2Q: Aggregation-Aware Quantization for Graph Neural Networks

2022

  • [ICLR] BiBERT: Accurate Fully Binarized BERT [code]
  • [IJCAI] BiFSMN: Binary Neural Network for Keyword Spotting [code] GitHub stars
  • [ACM MM] Towards Accurate Post-Training Quantization for Vision Transformer
  • [ACL] Compression of Generative Pre-trained Language Models via Quantization
  • [ACM Trans. Des. Autom. Electron. Syst.] Structured Dynamic Precision for Deep Neural Networks Quantization
  • [ASE] QVIP: An ILP-based Formal Verification Approach for Quantized Neural Networks
  • [Applied Soft Computing] A neural network compression method based on knowledge-distillation and parameter quantization for the bearing fault diagnosis
  • [CCF Transactions on High Performance Computing] An efficient segmented quantization for graph neural networks
  • [CVPR] A Low Memory Footprint Quantized Neural Network for Depth Completion of Very Sparse Time-of-Flight Depth Maps
  • [CVPR] BppAttack: Stealthy and Efficient Trojan Attacks against Deep Neural Networks via Image Quantization and Contrastive Adversarial Learning [code] GitHub stars
  • [CVPR] Data-Free Network Compression via Parametric Non-uniform Mixed Precision Quantization
  • [CVPR] Instance-Aware Dynamic Neural Network Quantization
  • [CVPR] IntraQ: Learning Synthetic Images With Intra-Class Heterogeneity for Zero-Shot Network Quantization [code] GitHub stars
  • [CVPR] It's All In the Teacher: Zero-Shot Quantization Brought Closer to the Teacher [code] GitHub stars
  • [CVPR] Learnable Lookup Table for Neural Network Quantization [code] GitHub stars
  • [CVPR] Mr.BiQ: Post-Training Non-Uniform Quantization based on Minimizing the Reconstruction Error
  • [CVPR] Nonuniform-to-Uniform Quantization: Towards Accurate Quantization via Generalized Straight-Through Estimation [code] GitHub stars
  • [CVPR] RecDis-SNN: Rectifying Membrane Potential Distribution for Directly Training Spiking Neural Networks
  • [CVPR] Simulated Quantization, Real Power Savings
  • [EANN] A Robust, Quantization-Aware Training Method for Photonic Neural Networks
  • [ECCV] BASQ: Branch-wise Activation-clipping Search Quantization for Sub-4-bit Neural Networks [code] GitHub stars
  • [ECCV] Mixed-Precision Neural Network Quantization via Learned Layer-Wise Importance [code] GitHub stars
  • [ECCV] Neuromorphic Data Augmentation for Training Spiking Neural Networks [code]
  • [ECCV] Non-Uniform Step Size Quantization for Accurate Post-Training Quantization
  • [ECCV] Patch Similarity Aware Data-Free Quantization for Vision Transformers [code] GitHub stars
  • [ECCV] PTQ4ViT: Post-Training Quantization for Vision Transformers with Twin Uniform Quantization [code] GitHub stars
  • [ECCV] RDO-Q: Extremely Fine-Grained Channel-Wise Quantization via Rate-Distortion Optimization
  • [ECCV] Symmetry Regularization and Saturating Nonlinearity for Robust Quantization
  • [ECCV] Towards Accurate Network Quantization with Equivalent Smooth Regularizer
  • [ECCV] Weight Fixing Networks [code]
  • [ESE] DiverGet: A Search-Based Software Testing Approach for Deep Neural Network Quantization Assessment
  • [Electronics] A Survey on Efficient Convolutional Neural Networks and Hardware Acceleration
  • [FPGA] FILM-QNN: Efficient FPGA Acceleration of Deep Neural Networks with Intra-Layer, Mixed-Precision Quantization
  • [ICCRD] Post Training Quantization after Neural Network
  • [ICLR] 8-bit Optimizers via Block-wise Quantization [code] GitHub stars
  • [ICLR] F8Net: Fixed-Point 8-bit Only Multiplication for Network Quantization
  • [ICLR] Information Bottleneck: Exact Analysis of (Quantized) Neural Networks [code] GitHub stars
  • [ICLR] Optimal ANN-SNN Conversion for High-accuracy and Ultra-low-latency Spiking Neural Networks
  • [ICLR] QDrop: Randomly Dropping Quantization for Extremely Low-bit Post-Training Quantization [code] GitHub stars
  • [ICLR] SQuant: On-the-Fly Data-Free Quantization via Diagonal Hessian Approximation [code]
  • [ICLR] Toward Efficient Low-Precision Training: Data Format Optimization and Hysteresis Quantization
  • [ICLR] VC dimension of partially quantized neural networks in the overparametrized regime
  • [ICML] Finding the Task-Optimal Low-Bit Sub-Distribution in Deep Neural Networks [code] GitHub stars
  • [ICML] GACT: Activation Compressed Training for Generic Network Architectures [code] GitHub stars
  • [ICML] Overcoming Oscillations in Quantization-Aware Training [code] GitHub stars
  • [ICML] SDQ: Stochastic Differentiable Quantization with Mixed Precision
  • [ICPR] Layer-Wise Data-Free CNN Compression
  • [IEEE Internet of Things Journal] FedQNN: A Computation–Communication-Efficient Federated Learning Framework for IoT With Low-Bitwidth Neural Network Quantization
  • [IJCAI] FQ-ViT: Post-Training Quantization for Fully Quantized Vision Transformer [code] GitHub stars
  • [IJCAI] MultiQuant: Training Once for Multi-bit Quantization of Neural Networks
  • [IJCAI] RAPQ: Rescuing Accuracy for Power-of-Two Low-bit Post-training Quantization [code] GitHub stars
  • [IJCNN] Accuracy Evaluation of Transposed Convolution-Based Quantized Neural Networks
  • [IJNS] Convolutional Neural Networks Quantization with Attention
  • [ITSM] Edge–Artificial Intelligence-Powered Parking Surveillance With Quantized Neural Networks
  • [Intelligent Automation & Soft Computing] A Resource-Efficient Convolutional Neural Network Accelerator Using Fine-Grained Logarithmic Quantization
  • [LNAI] ECQ$^x$: Explainability-Driven Quantization for Low-Bit and Sparse DNNs
  • [MICRO] ANT: Exploiting Adaptive Numerical Data Type for Low-bit Deep Neural Network Quantization
  • [NeurIPS] BiMLP: Compact Binary Architectures for Vision Multi-Layer Perceptrons [code]
  • [NeurIPS] BiT: Robustly Binarized Multi-distilled Transformer [code] GitHub stars
  • [NeurIPS] ClimbQ: Class Imbalanced Quantization Enabling Robustness on Efficient Inferences
  • [NeurIPS] Entropy-Driven Mixed-Precision Quantization for Deep Network Design
  • [NeurIPS] FP8 Quantization: The Power of the Exponent [code] GitHub stars
  • [NeurIPS] Leveraging Inter-Layer Dependency for Post-Training Quantization
  • [NeurIPS] LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale [code] GitHub stars
  • [NeurIPS] Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning [code] GitHub stars
  • [NeurIPS] Q-ViT: Accurate and Fully Quantized Low-bit Vision Transformer [code] GitHub stars
  • [NeurIPS] Redistribution of Weights and Activations for AdderNet Quantization
  • [NeurIPS] Theoretically Better and Numerically Faster Distributed Optimization with Smoothness-Aware Quantization Techniques
  • [NeurIPS] Towards Efficient Post-training Quantization of Pre-trained Language Models
  • [NeurIPS] ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers [code] GitHub stars
  • [Neural Networks] Quantization-aware training for low precision photonic neural networks
  • [Neurocomputing] EPQuant: A Graph Neural Network compression approach based on product quantization
  • [Ocean Engineering] Neural network based adaptive sliding mode tracking control of autonomous surface vehicles with input quantization and saturation
  • [PPoPP] QGTC: accelerating quantized graph neural networks via GPU tensor core
  • [TCCN] Low-Bitwidth Convolutional Neural Networks for Wireless Interference Identification
  • [TCSVT] An Efficient Implementation of Convolutional Neural Network With CLIP-Q Quantization on FPGA
  • [TGARS] Accelerating Convolutional Neural Network-Based Hyperspectral Image Classification by Step Activation Quantization
  • [TODAES] Dynamic Quantization Range Control for Analog-in-Memory Neural Networks Acceleration
  • [arXiv] Edge Inference with Fully Differentiable Quantized Mixed Precision Neural Networks
  • [arXiv] Neural Network Quantization with AI Model Efficiency Toolkit (AIMET)
  • [arXiv] Q-ViT: Fully Differentiable Quantization for Vision Transformer
  • [arXiv] QONNX: Representing Arbitrary-Precision Quantized Neural Networks
  • [arXiv] Quantune: Post-training Quantization of Convolutional Neural Networks using Extreme Gradient Boosting for Fast Deployment
  • [arXiv] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models [code] GitHub stars
  • [arXiv] Sub-8-Bit Quantization Aware Training for 8-Bit Neural Network Accelerator with On-Device Speech Recognition
  • [tinyML Research Symposium] Power-of-Two Quantization for Low Bitwidth and Hardware Compliant Neural Networks
  • [ECCV] CADyQ: Content-Aware Dynamic Quantization for Image Super-Resolution
  • [ECCV] Bitwidth-Adaptive Quantization-Aware Neural Network Training: A Meta-Learning Approach [code] GitHub stars
  • [ECCV] Fine-grained Data Distribution Alignment for Post-Training Quantization [code] GitHub stars
  • [ICML] Optimal Clipping and Magnitude-aware Differentiation for Improved Quantization-aware Training

2021

  • [CVPR] Diversifying Sample Generation for Accurate Data-Free Quantization
  • [ICLR] BiPointNet: Binary Neural Network for Point Clouds [code] GitHub stars
  • [ICML] How Do Adam and Training Strategies Help BNNs Optimization? [code] GitHub stars
  • [AAAI] Compressing Deep Convolutional Neural Networks by Stacking Low-Dimensional Binary Convolution Filters
  • [AAAI] Distribution Adaptive INT8 Quantization for Training CNNs
  • [AAAI] FracBits: Mixed Precision Quantization via Fractional Bit-Widths
  • [AAAI] Memory and Computation-Efficient Kernel SVM via Binary Embedding and Ternary Coefficients
  • [AAAI] OPQ: Compressing Deep Neural Networks with One-shot Pruning-Quantization
  • [AAAI] Optimizing Information Theory Based Bitwise Bottlenecks for Efficient Mixed-Precision Activation Quantization
  • [AAAI] Post-training Quantization with Multiple Points: Mixed Precision without Mixed Precision
  • [AAAI] Scalable Verification of Quantized Neural Networks [code] GitHub stars
  • [AAAI] Stochastic Precision Ensemble: Self-Knowledge Distillation for Quantized Deep Neural Networks
  • [AAAI] TRQ: Ternary Neural Networks with Residual Quantization
  • [AAAI] Uncertainty Quantification in CNN through the Bootstrap of Convex Neural Networks
  • [AAAI] Vector Quantized Bayesian Neural Network Inference for Data Streams
  • [ACL] On the Distribution, Sparsity, and Inference-time Quantization of Attention Values in Transformers
  • [ACM MM] Fully Quantized Image Super-Resolution Networks [code] GitHub stars
  • [ACM MM] VQMG: Hierarchical Vector Quantised and Multi-hops Graph Reasoning for Explicit Representation Learning
  • [CVPR] Binary Graph Neural Networks [code] GitHub stars
  • [CVPR] Learnable Companding Quantization for Accurate Low-bit Neural Networks
  • [CVPR] Network Quantization with Element-wise Gradient Scaling [code] GitHub stars
  • [CVPR] Permute, Quantize, and Fine-tune: Efficient Compression of Neural Networks [code] GitHub stars
  • [CVPR] PokeBNN: A Binary Pursuit of Lightweight Accuracy [code] GitHub stars
  • [CVPR] S2-BNN: Bridging the Gap Between Self-Supervised Real and 1-bit Neural Networks via Guided Distribution Calibration [code] GitHub stars
  • [CVPR] Zero-shot Adversarial Quantization [code] GitHub stars
  • [ECCV] PAMS: Quantized Super-Resolution via Parameterized Max Scale [code] GitHub stars
  • [ICCV] MixMix: All You Need for Data-Free Compression Are Feature and Data Mixing
  • [ICLR] BRECQ: Pushing the Limit of Post-Training Quantization by Block Reconstruction [code] GitHub stars
  • [ICLR] BSQ: Exploring Bit-Level Sparsity for Mixed-Precision Neural Network Quantization [code] GitHub stars
  • [ICLR] Degree-Quant: Quantization-Aware Training for Graph Neural Networks
  • [ICLR] High-Capacity Expert Binary Networks [code] GitHub stars
  • [ICLR] Incremental few-shot learning via vector quantization in deep embedded space
  • [ICLR] Multi-Prize Lottery Ticket Hypothesis: Finding Accurate Binary Neural Networks by Pruning A Randomly Weighted Network [code] GitHub stars
  • [ICLR] Neural gradients are near-lognormal: improved quantized and sparse training
  • [ICLR] Reducing the Computational Cost of Deep Generative Models with Binary Neural Networks
  • [ICLR] Simple Augmentation Goes a Long Way: ADRL for DNN Quantization
  • [ICLR] Sparse Quantized Spectral Clustering
  • [ICLR] Training with Quantization Noise for Extreme Model Compression [code] GitHub stars
  • [ICLR] WrapNet: Neural Net Inference with Ultra-Low-Resolution Arithmetic
  • [ICML] ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training [code] GitHub stars
  • [ICML] Auto-NBA: Efficient and Effective Search Over the Joint Space of Networks, Bitwidths, and Accelerators [code] GitHub stars
  • [ICML] Differentiable Dynamic Quantization with Mixed Precision and Adaptive Resolution
  • [ICML] HAWQ-V3: Dyadic Neural Network Quantization [code] GitHub stars
  • [ICML] I-BERT: Integer-only BERT Quantization [code] GitHub stars
  • [NeurIPS] A Winning Hand: Compressing Deep Networks Can Improve Out-of-Distribution Robustness [code] GitHub stars
  • [NeurIPS] Divergence Frontiers for Generative Models: Sample Complexity, Quantization Effects, and Frontier Integrals
  • [NeurIPS] Post-Training Quantization for Vision Transformer
  • [NeurIPS] Post-Training Sparsity-Aware Quantization [code] GitHub stars
  • [NeurIPS] Qimera: Data-free Quantization with Synthetic Boundary Supporting Samples [code] GitHub stars
  • [NeurIPS] Qu-ANTI-zation: Exploiting Quantization Artifacts for Achieving Adversarial Outcomes
  • [NeurIPS] VQ-GNN: A Universal Framework to Scale up Graph Neural Networks using Vector Quantization
  • [arXiv] A Survey of Quantization Methods for Efficient Neural Network Inference
  • [arXiv] A White Paper on Neural Network Quantization
  • [arXiv] Any-Precision Deep Neural Networks [code] GitHub stars
  • [arXiv] ReCU: Reviving the Dead Weights in Binary Neural Networks [code] GitHub stars
  • [AAAI] Training Binary Neural Network without Batch Normalization for Image Super-Resolution
  • [AAAI] SA-BNN: State-Aware Binary Neural Network
  • [CVPR] Automated Log-Scale Quantization for Low-Cost Deep Neural Networks
  • [CVPR] QPP: Real-Time Quantization Parameter Prediction for Deep Neural Networks
  • [ICLR] Improving Post Training Neural Quantization: Layer-wise Calibration and Integer Programming [code] GitHub stars
  • [ICML] Accurate Post Training Quantization With Small Calibration Sets
  • [NeurIPS] BatchQuant: Quantized-for-all Architecture Search with Robust Quantizer

2020

  • [CVPR] Forward and Backward Information Retention for Accurate Binary Neural Networks [code] GitHub stars
  • [PR] Binary neural networks: A survey
  • [AAAI] HLHLp: Quantized Neural Networks Training for Reaching Flat Minima in Loss Surface
  • [AAAI] Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
  • [AAAI] Sparsity-Inducing Binarized Neural Networks
  • [AAAI] Towards Accurate Low Bit-Width Quantization with Multiple Phase Adaptations
  • [ACL] End to End Binarized Neural Networks for Text Classification
  • [COOL CHIPS] A Novel In-DRAM Accelerator Architecture for Binary Neural Network
  • [CVPR] APQ: Joint Search for Network Architecture, Pruning and Quantization Policy [code] GitHub stars
  • [CVPR] BiDet: An Efficient Binarized Object Detector [code] GitHub stars
  • [CVPR] Fixed-Point Back-Propagation Training
  • [CVPR] GhostNet: More Features from Cheap Operations
  • [CVPR] Low-Bit Quantization Needs Good Distribution
  • [CVPR] Rotation Consistent Margin Loss for Efficient Low-Bit Face Recognition
  • [arXiv] Training Binary Neural Networks using the Bayesian Learning Rule
  • [DATE] BNNsplit: Binarized Neural Networks for embedded distributed FPGA-based computing systems
  • [DATE] OrthrusPE: Runtime Reconfigurable Processing Elements for Binary Neural Networks
  • [DATE] PhoneBit: Efficient GPU-Accelerated Binary Neural Network Inference Engine for Mobile Phones
  • [ECCV] BATS: Binary ArchitecTure Search
  • [ECCV] Differentiable Joint Pruning and Quantization for Hardware Efficiency
  • [ECCV] Generative Low-bitwidth Data Free Quantization [code] GitHub stars
  • [ECCV] Learning Architectures for Binary Networks [code] GitHub stars
  • [ECCV] PROFIT: A Novel Training Method for sub-4-bit MobileNet Models
  • [ECCV] ProxyBNN: Learning Binarized Neural Networks via Proxy Matrices
  • [ECCV] ReActNet: Towards Precise Binary Neural Network with Generalized Activation Functions [code] GitHub stars
  • [EMNLP] Fully Quantized Transformer for Machine Translation
  • [EMNLP] TernaryBERT: Distillation-aware Ultra-low Bit BERT [code] GitHub stars
  • [ICASSP] Balanced Binary Neural Networks with Gated Residual
  • [ICET] An Energy-Efficient Bagged Binary Neural Network Accelerator
  • [ICLR] BinaryDuo: Reducing Gradient Mismatch in Binary Activation Network by Coupling Binary Activations [code] GitHub stars
  • [ICLR] DMS: Differentiable Dimension Search for Binary Neural Networks
  • [ICLR] Learned Step Size Quantization
  • [ICLR] Mixed Precision DNNs: All You Need is a Good Parametrization [code] GitHub stars
  • [ICLR] Training Binary Neural Networks with Real-to-Binary Convolutions
  • [ICML] Accelerating Large-Scale Inference with Anisotropic Vector Quantization
  • [ICML] LSQ+: Improving low-bit quantization through learnable offsets and better initialization
  • [ICML] Training Binary Neural Networks through Learning with Noisy Supervision
  • [ICML] Up or Down? Adaptive Rounding for Post-Training Quantization
  • [IEEE Access] An Energy-Efficient and High Throughput in-Memory Computing Bit-Cell With Excellent Robustness Under Process Variations for Binary Neural Network
  • [IEEE TCS.I] IMAC: In-Memory Multi-Bit Multiplication and ACcumulation in 6T SRAM Array
  • [IEEE TCS.II] A Resource-Efficient Inference Accelerator for Binary Convolutional Neural Networks
  • [IEEE Trans. Electron Devices] Design of High Robustness BNN Inference Accelerator Based on Binary Memristors
  • [IEEE Trans. Magn] SIMBA: A Skyrmionic In-Memory Binary Neural Network Accelerator
  • [IJCAI] CP-NAS: Child-Parent Neural Architecture Search for Binary Neural Networks
  • [IJCAI] Direct Quantization for Training Highly Accurate Low Bit-width Deep Neural Networks
  • [IJCAI] Fully Nested Neural Network for Adaptive Compression and Quantization
  • [IJCAI] Overflow Aware Quantization: Accelerating Neural Network Inference by Low-bit Multiply-Accumulate Operations
  • [IJCAI] Soft Threshold Ternary Networks
  • [IJCAI] Towards Fully 8-bit Integer Inference for the Transformer Model
  • [IJCV] Binarized Neural Architecture Search for Efficient Object Recognition
  • [ISCAS] MuBiNN: Multi-Level Binarized Recurrent Neural Network for EEG Signal Classification
  • [ISQED] BNN Pruning: Pruning Binary Neural Network Guided by Weight Flipping Frequency [code] GitHub stars
  • [MICRO] GOBO: Quantizing Attention-Based NLP Models for Low Latency and Energy Efficient Inference
  • [MLST] Compressing deep neural networks on FPGAs to binary and ternary precision with HLS4ML
  • [NN] Training high-performance and large-scale deep neural networks with full 8-bit integers
  • [NeurIPS] Adaptive Gradient Quantization for Data-Parallel SGD [code] GitHub stars
  • [NeurIPS] Bayesian Bits: Unifying Quantization and Pruning
  • [NeurIPS] Closing the Dequantization Gap: PixelCNN as a Single-Layer Flow [code] GitHub stars
  • [NeurIPS] Efficient Exact Verification of Binarized Neural Networks [code] GitHub stars
  • [NeurIPS] FleXOR: Trainable Fractional Quantization
  • [NeurIPS] HAWQ-V2: Hessian Aware trace-Weighted Quantization of Neural Networks
  • [NeurIPS] Path Sample-Analytic Gradient Estimators for Stochastic Binary Networks [code] GitHub stars
  • [NeurIPS] Position-based Scaled Gradient for Model Quantization and Pruning [code] GitHub stars
  • [NeurIPS] Robust Quantization: One Model to Rule Them All
  • [NeurIPS] Rotated Binary Neural Network [code] GitHub stars
  • [NeurIPS] Searching for Low-Bit Weights in Quantized Neural Networks [code] GitHub stars
  • [NeurIPS] Universally Quantized Neural Compression
  • [Neurocomputing] Eye localization based on weight binarization cascade convolution neural network
  • [PR Letters] Controlling information capacity of binary neural network
  • [SysML] Riptide: Fast End-to-End Binarized Neural Networks [code] GitHub stars
  • [TPAMI] Deep Neural Network Compression by In-Parallel Pruning-Quantization
  • [TPAMI] Hierarchical Binary CNNs for Landmark Localization with Limited Resources [code]
  • [TPAMI] Towards Efficient U-Nets: A Coupled and Quantized Approach
  • [TVLSI] Phoenix: A Low-Precision Floating-Point Quantization Oriented Architecture for Convolutional Neural Networks
  • [WACV] MoBiNet: A Mobile Binary Network for Image Classification
  • [arXiv] Accelerating Binarized Neural Networks via Bit-Tensor-Cores in Turing GPUs [code] GitHub stars
  • [arXiv] Binarized Graph Neural Network
  • [arXiv] BinaryBERT: Pushing the Limit of BERT Quantization [code] GitHub stars
  • [arXiv] Distillation Guided Residual Learning for Binary Convolutional Neural Networks
  • [arXiv] How Does Batch Normalization Help Binary Training?
  • [arXiv] MeliusNet: Can Binary Neural Networks Achieve MobileNet-level Accuracy? [code] GitHub stars
  • [arXiv] RPR: Random Partition Relaxation for Training Binary and Ternary Weight Neural Networks
  • [arXiv] Training with Quantization Noise for Extreme Model Compression [code] GitHub stars
  • [arXiv] Understanding Learning Dynamics of Binary Neural Networks via Information Bottleneck
  • [paper] Towards Lossless Binary Convolutional Neural Networks Using Piecewise Approximation
  • [CVPR] ZeroQ: A Novel Zero Shot Quantization Framework [code] GitHub stars
  • [CVPR] AdaBits: Neural Network Quantization With Adaptive Bit-Widths [code] GitHub stars
  • [CVPR] Adaptive Loss-aware Quantization for Multi-bit Networks [code] GitHub stars
  • [ECCV] HMQ: Hardware Friendly Mixed Precision Quantization Block for CNNs [code] GitHub stars

2019

  • [AAAI] Efficient Quantization for Neural Networks with Binary Weights and Low Bitwidth Activations
  • [AAAI] Projection Convolutional Neural Networks for 1-bit CNNs via Discrete Back Propagation
  • [APCCAS] Using Neuroevolved Binary Neural Networks to solve reinforcement learning environments [code] GitHub stars
  • [BMVC] Accurate and Compact Convolutional Neural Networks with Trained Binarization
  • [BMVC] XNOR-Net++: Improved Binary Neural Networks
  • [CVPR] A Main/Subsidiary Network Framework for Simplifying Binary Neural Network
  • [CVPR] Binary Ensemble Neural Network: More Bits per Network or More Networks per Bit?
  • [CVPR] Circulant Binary Convolutional Networks: Enhancing the Performance of 1-bit DCNNs with Circulant Back Propagation
  • [CVPR] Fully Quantized Network for Object Detection
  • [CVPR] HAQ: Hardware-Aware Automated Quantization with Mixed Precision [code] GitHub stars
  • [CVPR] Learning Channel-Wise Interactions for Binary Convolutional Neural Networks
  • [CVPR] Learning to Quantize Deep Networks by Optimizing Quantization Intervals with Task Loss
  • [CVPR] Quantization Networks [code] GitHub stars
  • [CVPR] Regularizing Activation Distribution for Training Binarized Deep Networks
  • [CVPR] SeerNet: Predicting Convolutional Neural Network Feature-Map Sparsity Through Low-Bit Quantization
  • [CVPR] Structured Binary Neural Networks for Accurate Image Classification and Semantic Segmentation
  • [arXiv] Back to Simplicity: How to Train Accurate BNNs from Scratch? [code] GitHub stars
  • [arXiv] Binarized Neural Architecture Search
  • [arXiv] Improved training of binary networks for human pose estimation and image recognition
  • [arXiv] Matrix and tensor decompositions for training binary neural networks
  • [arXiv] RBCN: Rectified Binary Convolutional Networks for Enhancing the Performance of 1-bit DCNNs
  • [arXiv] TentacleNet: A Pseudo-Ensemble Template for Accurate Binary Convolutional Neural Networks
  • [FPGA] Towards Fast and Energy-Efficient Binarized Neural Network Inference on FPGA
  • [GLSVLSI] Binarized Depthwise Separable Neural Network for Object Tracking in FPGA
  • [ICCV] Bayesian Optimized 1-Bit CNNs
  • [ICCV] Data-Free Quantization Through Weight Equalization and Bias Correction [code] GitHub stars
  • [ICCV] Differentiable Soft Quantization: Bridging Full-Precision and Low-Bit Neural Networks
  • [ICCV] DSConv: Efficient Convolution Operator
  • [ICCV] HAWQ: Hessian AWare Quantization of Neural Networks With Mixed-Precision
  • [ICCV] Searching for Accurate Binary Neural Architectures
  • [ICIP] Training Accurate Binary Neural Networks from Scratch [code] GitHub stars
  • [ICLR] An Empirical study of Binary Neural Networks' Optimisation
  • [ICLR] ProxQuant: Quantized Neural Networks via Proximal Operators [code] GitHub stars
  • [ICML] Efficient 8-Bit Quantization of Transformer Neural Machine Language Translation Model
  • [ICUS] Balanced Circulant Binary Convolutional Networks
  • [IEEE J. Emerg. Sel. Topics Circuits Syst.] Hyperdrive: A Multi-Chip Systolically Scalable Binary-Weight CNN Inference Engine
  • [IEEE J. Solid-State Circuits] An Energy-Efficient Reconfigurable Processor for Binary- and Ternary-Weight Neural Networks With Flexible Data Bit Width
  • [IEEE JETC] Eyeriss v2: A Flexible Accelerator for Emerging Deep Neural Networks on Mobile Devices
  • [IEEE TCS.I] Recursive Binary Neural Network Training Model for Efficient Usage of On-Chip Memory
  • [IEEE TCS.I] Xcel-RAM: Accelerating Binary Neural Networks in High-Throughput SRAM Compute Arrays
  • [IJCAI] Binarized Collaborative Filtering with Distilling Graph Convolutional Network
  • [IJCAI] Binarized Neural Networks for Resource-Efficient Hashing with Minimizing Quantization Loss
  • [ISOCC] Dual Path Binary Neural Network
  • [MDPI Electronics] A Review of Binarized Neural Networks
  • [NeurIPS] Fully Quantized Transformer for Improved Translation
  • [NeurIPS] Latent Weights Do Not Exist: Rethinking Binarized Neural Network Optimization [code] GitHub stars
  • [NeurIPS] MetaQuant: Learning to Quantize by Learning to Penetrate Non-differentiable Quantization [code] GitHub stars
  • [NeurIPS] Model Compression with Adversarial Robustness: A Unified Optimization Framework
  • [NeurIPS] Normalization Helps Training of Quantized LSTM
  • [NeurIPS] Q8BERT: Quantized 8Bit BERT
  • [NeurIPS] Regularized Binary Network Training
  • [RoEduNet] PXNOR: Perturbative Binary Neural Network [code] GitHub stars
  • [SiPS] Knowledge distillation for optimization of quantized deep neural networks
  • [TMM] Compact Hash Code Learning With Binary Deep Neural Network
  • [TMM] Deep Binary Reconstruction for Cross-Modal Hashing
  • [VLSI-SoC] A Product Engine for Energy-Efficient Execution of Binary Neural Networks Using Resistive Memories
  • [arXiv] daBNN: A Super Fast Inference Framework for Binary Neural Networks on ARM devices [code] GitHub stars
  • [arXiv] Mixed Precision Quantization of ConvNets via Differentiable Neural Architecture Search
  • [arXiv] QKD: Quantization-aware Knowledge Distillation
  • [arXiv] Self-Binarizing Networks
  • [arXiv] Towards Unified INT8 Training for Convolutional Neural Network
  • [arXiv] BNN+: Improved Binary Network Training
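Many of the low-bit works listed above (e.g., Differentiable Soft Quantization, HAWQ, unified INT8 training) simulate quantization during training with a quantize-dequantize round trip. The sketch below shows this common "fake quantization" pattern in its simplest symmetric per-tensor form; it is a generic illustration under that assumption, not any single paper's method.

```python
import numpy as np

def fake_quantize(x, bits=8):
    """Symmetric per-tensor quantize-dequantize ("fake quantization").
    Illustrative sketch of the simulation step shared by many low-bit
    training methods; real methods learn or calibrate the scale."""
    qmax = 2 ** (bits - 1) - 1                 # e.g. 127 for 8 bits
    amax = np.max(np.abs(x))
    scale = amax / qmax if amax > 0 else 1.0   # naive max-based scale
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)  # integer grid
    return q * scale                           # back to float for training

x = np.random.randn(1000).astype(np.float32)
x8 = fake_quantize(x, bits=8)
```

Per-element error is bounded by half the quantization step, which is why 8-bit simulation is usually near-lossless while 4-bit and below needs the more careful methods above.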

2018

  • [AAAI] Extremely Low Bit Neural Network: Squeeze the Last Bit Out with ADMM [code]
  • [AAAI] From Hashing to CNNs: Training BinaryWeight Networks via Hashing
  • [CAAI] Fast object detection based on binary deep convolution neural networks
  • [CVPR] Effective Training of Convolutional Neural Networks with Low-bitwidth Weights and Activations
  • [CVPR] Explicit loss-error-aware quantization for low-bit deep neural networks
  • [CVPR] Modulated Convolutional Networks
  • [CVPR] Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference
  • [CVPR] SYQ: Learning Symmetric Quantization For Efficient Deep Neural Networks [code]
  • [CVPR] Towards Effective Low-bitwidth Convolutional Neural Networks
  • [CVPR] Two-Step Quantization for Low-bit Neural Networks
  • [arXiv] BinaryRelax: A Relaxation Approach For Training Deep Neural Networks With Quantized Weights
  • [arXiv] LightNN: Filling the Gap between Conventional Deep Neural Networks and Binarized Networks
  • [ECCV] Bi-Real Net: Enhancing the Performance of 1-bit CNNs With Improved Representational Capability and Advanced Training Algorithm [code] GitHub stars
  • [ECCV] LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks [code] GitHub stars
  • [ECCV] Quantization Mimic: Towards Very Tiny CNN for Object Detection
  • [ECCV] TBN: Convolutional Neural Network with Ternary Inputs and Binary Weights [code] GitHub stars
  • [ECCV] Training Binary Weight Networks via Semi-Binary Decomposition
  • [FCCM] ReBNet: Residual Binarized Neural Network [code] GitHub stars
  • [FPL] FBNA: A Fully Binarized Neural Network Accelerator
  • [ICLR] Analysis of Quantized Models
  • [ICLR] Apprentice: Using Knowledge Distillation Techniques To Improve Low-Precision Network Accuracy
  • [ICLR] Loss-aware Weight Quantization of Deep Networks [code] GitHub stars
  • [ICLR] Model compression via distillation and quantization [code] GitHub stars
  • [ICLR] PACT: Parameterized Clipping Activation for Quantized Neural Networks
  • [ICLR] WRPN: Wide Reduced-Precision Networks
  • [IEEE J. Solid-State Circuits] BRein Memory: A Single-Chip Binary/Ternary Reconfigurable in-Memory Deep Neural Network Accelerator Achieving 1.4 TOPS at 0.6 W
  • [IJCAI] Deterministic Binary Filters for Convolutional Neural Networks
  • [IJCAI] Planning in Factored State and Action Spaces with Learned Binarized Neural Network Transition Models
  • [IJCNN] Analysis and Implementation of Simple Dynamic Binary Neural Networks
  • [IPDPS] BitFlow: Exploiting Vector Parallelism for Binary Neural Networks on CPU
  • [MM] BitStream: Efficient Computing Architecture for Real-Time Low-Power Inference of Binary Neural Networks on CPUs
  • [NCA] A survey of FPGA-based accelerators for convolutional neural networks
  • [NeurIPS] Scalable methods for 8-bit training of neural networks [code] GitHub stars
  • [NeurIPS] Training Deep Neural Networks with 8-bit Floating Point Numbers
  • [Res Math Sci] Blended coarse gradient descent for full quantization of deep neural networks
  • [TCAD] XNOR Neural Engine: A Hardware Accelerator IP for 21.6-fJ/op Binary Neural Network Inference
  • [TRETS] FINN-R: An End-to-End Deep-Learning Framework for Fast Exploration of Quantized Neural Networks
  • [TVLSI] An Energy-Efficient Architecture for Binary Weight Convolutional Neural Networks
  • [arXiv] Joint Neural Architecture Search and Quantization [code] GitHub stars
  • [arXiv] Training Competitive Binary Neural Networks from Scratch [code] GitHub stars
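A recurring idea in the 2018 entries (PACT, LQ-Nets, Two-Step Quantization) is clipping activations to a bounded range before uniform quantization. The sketch below follows the spirit of PACT, where the clipping level alpha is a learned parameter; here it is fixed purely for illustration.

```python
import numpy as np

def pact_activation_quant(x, alpha=6.0, bits=4):
    """Clip activations to [0, alpha], then quantize uniformly to
    2^bits - 1 steps. In PACT, alpha is trained with the network;
    the fixed value here is an illustrative assumption."""
    levels = 2 ** bits - 1
    y = np.clip(x, 0.0, alpha)                     # bounded, ReLU-like range
    return np.round(y * levels / alpha) * alpha / levels

a = np.array([-1.0, 0.3, 2.7, 9.0])
aq = pact_activation_quant(a)   # negatives -> 0, outliers saturate at alpha
```

Bounding the range trades off clipping error against resolution: a smaller alpha clips more outliers but gives finer steps to the values that remain.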

2017

  • [CVPR] Deep Learning with Low Precision by Half-wave Gaussian Quantization [code] GitHub stars
  • [CVPR] Local Binary Convolutional Neural Networks [code] GitHub stars
  • [arXiv] BMXNet: An Open-Source Binary Neural Network Implementation Based on MXNet [code]
  • [FPGA] FINN: A Framework for Fast, Scalable Binarized Neural Network Inference
  • [ICASSP] Fixed-point optimization of deep neural networks with adaptive step size retraining
  • [ICCV] Binarized Convolutional Landmark Localizers for Human Pose Estimation and Face Alignment with Limited Resources [code]
  • [ICCV] Performance Guaranteed Network Acceleration via High-Order Residual Quantization
  • [ICLR] Incremental Network Quantization: Towards Lossless CNNs with Low-Precision Weights [code] GitHub stars
  • [ICLR] Loss-aware Binarization of Deep Networks [code] GitHub stars
  • [ICLR] Soft Weight-Sharing for Neural Network Compression
  • [ICLR] Trained Ternary Quantization [code] GitHub stars
  • [IPDPSW] On-Chip Memory Based Binarized Convolutional Deep Neural Network Applying Batch Normalization Free Technique on an FPGA
  • [InterSpeech] Binary Deep Neural Networks for Speech Recognition
  • [JETC] A GPU-Outperforming FPGA Accelerator Architecture for Binary Convolutional Neural Networks
  • [MWSCAS] Deep learning binary neural network on an FPGA
  • [NeurIPS] Towards Accurate Binary Convolutional Neural Network [code] GitHub stars
  • [Neurocomputing] FP-BNN: Binarized neural network on FPGA
  • [arXiv] ShiftCNN: Generalized Low-Precision Architecture for Inference of Convolutional Neural Networks [code] GitHub stars
  • [arXiv] Ternary Neural Networks with Fine-Grained Quantization
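The ternary entries above (Trained Ternary Quantization here, Ternary Weight Networks in 2016) map weights to {-alpha, 0, +alpha} via a threshold. The sketch below uses the TWN-style heuristic threshold of roughly 0.7 times the mean absolute weight; both the threshold rule and the single shared scale are illustrative simplifications (TTQ learns separate positive/negative scales).

```python
import numpy as np

def ternarize(w, frac=0.7):
    """Threshold-based ternarization to {-alpha, 0, +alpha}.
    frac * mean(|w|) approximates the TWN threshold heuristic;
    alpha is the mean magnitude of the surviving weights."""
    delta = frac * np.mean(np.abs(w))
    mask = np.abs(w) > delta                 # weights kept nonzero
    alpha = np.mean(np.abs(w[mask])) if mask.any() else 0.0
    return alpha * np.sign(w) * mask

w = np.array([0.9, -0.05, 0.4, -1.1, 0.02])
wt = ternarize(w)   # small weights snap to 0, the rest to +/-alpha
```

Compared with binary weights, the explicit zero level adds sparsity, which the hardware-oriented papers above exploit to skip multiplications entirely.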

2016

  • [CVPR] Quantized Convolutional Neural Networks for Mobile Devices [code]
  • [arXiv] DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients [code] GitHub stars
  • [ECCV] XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks [code] GitHub stars
  • [ICASSP] Fixed-point Performance Analysis of Recurrent Neural Networks
  • [NeurIPS] Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1 [code] GitHub stars
  • [NeurIPS] Ternary weight networks [code] GitHub stars
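XNOR-Net, listed above, binarizes weights to a scaled sign and shows that the scale minimizing the reconstruction error is the mean absolute weight. A minimal per-tensor sketch (XNOR-Net actually uses per-filter scales, and also binarizes activations; both are omitted here for brevity):

```python
import numpy as np

def xnor_binarize(w):
    """Binarize weights to alpha * {-1, +1} with the optimal scale
    alpha = mean(|w|) from the XNOR-Net/BWN analysis. Per-tensor for
    simplicity; per-filter scales are the paper's actual choice."""
    alpha = np.mean(np.abs(w))
    b = np.where(w >= 0, 1.0, -1.0)   # sign with sign(0) := +1
    return alpha * b

w = np.array([0.5, -1.5, 0.0, 2.0])
wb = xnor_binarize(w)
```

With both weights and activations binarized, convolutions reduce to XNOR and popcount operations, which is the source of the large speedups reported by the FPGA/ASIC papers in this list.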

2015

  • [ICML] Bitwise Neural Networks
  • [NeurIPS] BinaryConnect: Training Deep Neural Networks with binary weights during propagations [code] GitHub stars
  • [arXiv] Resiliency of Deep Neural Networks under quantizations
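BinaryConnect, above, established the pattern most later binarization papers build on: binarize weights in the forward pass, but accumulate gradients in latent real-valued weights (the straight-through estimator), clipping them to [-1, 1]. A minimal single-step sketch of that update rule, assuming deterministic binarization (the paper also describes a stochastic variant):

```python
import numpy as np

def binaryconnect_step(w_real, grad, lr=0.1):
    """One BinaryConnect-style step: forward uses sign(w_real); the
    gradient (straight-through w.r.t. the binary weights) updates the
    latent real-valued weights, which are clipped to [-1, 1]."""
    w_bin = np.where(w_real >= 0, 1.0, -1.0)          # used in forward pass
    w_real = np.clip(w_real - lr * grad, -1.0, 1.0)   # latent update + clip
    return w_bin, w_real

wb, wr = binaryconnect_step(np.array([0.2, -0.9]), np.array([5.0, -5.0]))
```

Keeping the real-valued copy is essential: the sign function has zero gradient almost everywhere, so updating the binary weights directly would stall training.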

Related Repositories

Star History

Star History Chart
