
Conversation

@lkk12014402
Contributor

Description

Update the QAT example and API documentation.

Signed-off-by: lkk <[email protected]>

Copilot AI left a comment


Pull request overview

This PR updates the QAT (Quantization-Aware Training) documentation and examples by removing an outdated quantization script and updating the README with corrected instructions.

Key Changes:

  • Removed the standalone quantize_autoround.py script
  • Updated README documentation to reference the centralized auto_round example instead

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| quantize_autoround.py | Removed outdated standalone quantization script |
| README.md | Updated Step 2 instructions with corrected command and reference to centralized auto_round example |


##### Step 2:

Save the model directly to a get post training quantization model with using [auto-round](https://github.com/intel/auto-round).
Save the model directly to a get post training quantization model with following this example [auto_round

Copilot AI Dec 16, 2025


The phrase 'to a get post training' contains grammatical errors. It should be 'to get a post-training' or 'to get post-training'.

Suggested change
Save the model directly to a get post training quantization model with following this example [auto_round
Save the model directly to get a post-training quantization model by following this example [auto_round
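For readers unfamiliar with what this step produces: post-training quantization maps float weights to low-bit integers plus a scale factor. The toy round-to-nearest sketch below illustrates only that basic round trip; it is not auto-round's actual algorithm (auto-round additionally optimizes the rounding decisions), and all names here are illustrative.

```python
# Toy illustration of post-training weight quantization (round-to-nearest).
# NOT auto-round's algorithm; it only shows the quantize/dequantize round
# trip that a post-training quantization step performs on each weight group.

def quantize_rtn(weights, bits=4):
    """Symmetric round-to-nearest quantization of a list of floats."""
    qmax = 2 ** (bits - 1) - 1                      # e.g. 7 for signed 4-bit
    scale = max(abs(w) for w in weights) / qmax or 1.0
    q = [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Map the integer codes back to approximate float weights."""
    return [v * scale for v in q]

weights = [0.12, -0.53, 0.97, -0.08]
q, scale = quantize_rtn(weights, bits=4)            # q = [1, -4, 7, -1]
restored = dequantize(q, scale)                     # each within scale/2 of original
```

The reconstruction error of round-to-nearest is bounded by half the scale per weight, which is exactly the slack that smarter rounding schemes such as auto-round try to spend more effectively.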

@lkk12014402 lkk12014402 added this to the 3.7 milestone Dec 16, 2025

```
python quantize_autoround.py
CUDA_VISIBLE_DEVICES=0 python ../auto_round/quantize.py \
```

@yiliu30 yiliu30 requested a review from xin3he December 17, 2025 00:48

This section walks through an end-to-end example based on the provided code and examples in:

`examples/pytorch/nlp/huggingface_models/language-modeling/quantization/llm_qat/`
Contributor

It would be better to add a link.


`requirements.txt` includes (among others):

- `auto-round==0.8.0`
Contributor

use 0.9.3?

```
--model vllm \
--model_args pretrained=./llama3.1-finetuned-qat,\
tensor_parallel_size=1,data_parallel_size=1,\
gpu_memory_utilization=0.3,max_model_len=32768,enforce_eager=True \
```
Contributor

gpu_memory_utilization is quite low, and enforce_eager causes poor performance.
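If those flags are revisited along the lines of this comment, the invocation might look like the sketch below. The `lm_eval --model vllm` form and the `--tasks` flag are assumptions about the surrounding command (only the `--model_args` fragment is quoted above), and the values are illustrative, not taken from the PR.

```shell
# Illustrative revision: raise the GPU memory fraction and drop
# enforce_eager=True so vLLM can use CUDA graphs. Tune
# gpu_memory_utilization to the GPU actually available.
lm_eval --model vllm \
  --model_args pretrained=./llama3.1-finetuned-qat,\
tensor_parallel_size=1,data_parallel_size=1,\
gpu_memory_utilization=0.8,max_model_len=32768 \
  --tasks gsm8k
```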

```
eval_size: int = 0
```

4. **QuantizationArguments**
Contributor

Duplicate of the QuantizationArguments at L144?
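The quoted `eval_size` field suggests these argument groups are plain dataclasses, the usual pattern with transformers' `HfArgumentParser`. A minimal self-contained sketch; every name except `eval_size` and `QuantizationArguments` is an assumption, not taken from this PR:

```python
from dataclasses import dataclass

# Hypothetical sketch of the argument groups discussed above; only
# eval_size appears in the quoted snippet, the rest is illustrative.

@dataclass
class DataArguments:
    dataset_name: str = "NeelNanda/pile-10k"  # illustrative default
    eval_size: int = 0                        # 0 means: no held-out eval split

@dataclass
class QuantizationArguments:
    bits: int = 4
    group_size: int = 128

args = DataArguments()
```

Keeping each group defined exactly once avoids the duplication the comment above points out, since two identically named dataclasses in one module would silently shadow each other.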

