[OpenVINO Quantizer] Update OpenVINO Quantizer Tutorial by anzr299 · Pull Request #3889 · pytorch/tutorials

anzr299 · 2026-05-11T13:13:21Z

Description

The OpenVINO quantizer was recently moved from NNCF to ExecuTorch, which breaks the imports used in this tutorial.
This PR updates the imports to use ExecuTorch.

Checklist

[ ] The issue that is being fixed is referred to in the description (see above "Fixes #ISSUE_NUMBER")
[x] Only one issue is addressed in this pull request
[ ] Labels from the issue that this PR is fixing are added to this pull request
[x] No unnecessary issues are included in this pull request.

pytorch-bot · 2026-05-11T13:13:25Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/tutorials/3889

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❗ 2 Active SEVs

There are 2 currently active SEVs. If your PR is affected, please view them below:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

anzr299 · 2026-05-11T13:21:46Z

-* ``target_device`` - defines the target device, the specificity of which will be taken into account during optimization. The following values are supported: ``ANY`` (default), ``CPU``, ``CPU_SPR``, ``GPU``, and ``NPU``.
-
-    .. code-block:: python
-
-        OpenVINOQuantizer(target_device=nncf.TargetDevice.CPU)
-


@daniil-lyakhov I couldn't find ``target_device`` being used anywhere inside the OpenVINO quantizer.

daniil-lyakhov · 2026-05-11T13:34:22Z

+* ``mode`` - defines quantization scheme for the model. Multiple modes are supported:

-    * ``PERFORMANCE`` (default) - defines symmetric quantization of weights and activations
+    * ``INT8_SYM`` (default) - defines symmetric quantization of weights and activations. This is the best for performance


Why leave weight compression behind? Can we extend the example with WC?

In the optional part? I agree; I will add it there.
Also, maybe we could change the link so that it points to a PTQ example in ExecuTorch, such as YOLO, instead of the NNCF ResNet example. What do you think?

What do you mean by "this is the best for performance"? This is unclear.

daniil-lyakhov · 2026-05-18T14:30:05Z

    float_model(Python)                          Example Input
        \                                              /
         \                                            /
-    —--------------------------------------------------------


daniil-lyakhov · 2026-05-18T14:30:39Z

+* ``mode`` - defines quantization scheme for the model. Multiple modes are supported:

-    * ``PERFORMANCE`` (default) - defines symmetric quantization of weights and activations
+    * ``INT8_SYM`` (default) - defines symmetric quantization of weights and activations. This is the best for performance


What do you mean by "this is the best for performance"? This is unclear.
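
For readers unfamiliar with the term, the "symmetric quantization" that ``INT8_SYM`` refers to can be illustrated with a minimal pure-Python sketch. This is not the NNCF or ExecuTorch implementation; the function names ``quantize_symmetric`` and ``dequantize`` are hypothetical, and the sketch assumes a single per-tensor scale with the zero point fixed at 0.

```python
def quantize_symmetric(values, num_bits=8):
    """Map floats to signed integers with one scale and a zero point of 0."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer and clamp to the symmetric range [-qmax, qmax].
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the integer representation."""
    return [q * scale for q in quantized]


# Illustrative values only; they stand in for a weight or activation tensor.
weights = [-1.0, -0.4, 0.0, 0.2, 1.0]
q, scale = quantize_symmetric(weights)
restored = dequantize(q, scale)
```

Because the range is centered on zero, no zero-point offset has to be stored or applied, which is one common reason symmetric schemes are associated with faster integer kernels. Whether that is what the tutorial text means by "best for performance" is for the PR author to confirm.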

daniil-lyakhov · 2026-05-18T14:32:26Z

        exported_model, quantizer, calibration_dataset, smooth_quant=True, fast_bias_correction=False
    )

+Weights Only Quantization


Suggested change:

-Weights Only Quantization
+Weights Only Compression

daniil-lyakhov · 2026-05-18T14:34:10Z

+Data-free algorithms
+~~~~~~~~~~~~~~~~~~~~
+
+When no calibration data is available, ``compress_pt2e`` can perform weight compression relying solely on the pretrained weights. Data-Free Compression uses only the weight tensor statistics, with no activations observed at any point. It can be combined with the AWQ and Mixed Precision algorithms when richer behavior is needed without giving up the no-dataset workflow.
+
+.. code-block:: python
+
+    from nncf.experimental.torch.fx import compress_pt2e
+
+    compressed_model = compress_pt2e(exported_model, quantizer, awq=True, ratio=0.8)
+
+Mixed Precision algorithms
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Mixed Precision assigns different bit-widths (e.g. INT4 vs INT8) to individual layers based on their sensitivity, keeping more sensitive layers at higher precision while aggressively compressing the rest. NNCF supports several sensitivity-ranking criteria:
+
+- **Weight Quantization Error** - Data-free metric that measures the per-layer error introduced by quantizing the weights themselves, requiring no calibration data.
+- **Hessian** - Activation-aware metric that uses second-order information about the loss to estimate how much the model output changes when a layer's weights are perturbed by quantization.
+- **Mean Variance** and **Max Variance** - Activation-aware metrics that rank layers by the mean or maximum variance of their input activations, on the intuition that layers with more spread-out activations are harder to quantize.
+- **Mean Magnitude** - Activation-aware metric that ranks layers by the average magnitude of their input activations.


This is too much text. Please move it before the AWQ/scale estimation part, at the same hierarchy level, and add a comment in the code next to the calibration dataset noting that it is optional in data-free mode.
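
As a rough illustration of the data-free idea discussed in this thread, the sketch below ranks layers by the error that INT4 weight quantization would introduce and keeps the most sensitive ones at INT8. It is a simplified stand-in, not NNCF's algorithm: the helpers ``quantization_error`` and ``assign_bit_widths`` are hypothetical, the layer names and weights are made up, and ``ratio`` is interpreted here as the fraction of layers sent to INT4, which is an assumption about the parameter's meaning.

```python
def quantization_error(weights, num_bits):
    """Mean squared error of symmetric per-tensor quantization (weights only)."""
    qmax = 2 ** (num_bits - 1) - 1
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    total = 0.0
    for w in weights:
        q = max(-qmax, min(qmax, round(w / scale)))
        total += (w - q * scale) ** 2
    return total / len(weights)


def assign_bit_widths(layers, ratio):
    """Give INT4 to the least sensitive layers and INT8 to the rest."""
    # Sort from lowest to highest INT4 error; no calibration data is needed.
    ranked = sorted(layers, key=lambda name: quantization_error(layers[name], 4))
    num_int4 = int(len(ranked) * ratio)
    return {name: (4 if i < num_int4 else 8) for i, name in enumerate(ranked)}


# Hypothetical layers; "fc2" has one outlier that inflates its INT4 error.
layers = {
    "fc1": [0.5, -0.5, 0.5, -0.5],
    "fc2": [1.0, 0.1, -0.3, 0.05],
    "fc3": [0.7, -0.7, 0.0, 0.7],
}
plan = assign_bit_widths(layers, ratio=0.67)
```

With these made-up weights, the layer containing the outlier is the one kept at the higher precision, while the other two are compressed to 4 bits.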

daniil-lyakhov · 2026-05-18T14:34:45Z

+- **Weight Quantization Error** - Data-free metric that measures the per-layer error introduced by quantizing the weights themselves, requiring no calibration data.
+- **Hessian** - Activation-aware metric that uses second-order information about the loss to estimate how much the model output changes when a layer's weights are perturbed by quantization.
+- **Mean Variance** and **Max Variance** - Activation-aware metrics that rank layers by the mean or maximum variance of their input activations, on the intuition that layers with more spread-out activations are harder to quantize.


Do we have a doc with all the details? I would prefer a link here.
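
Until such a link is available, the activation-aware variance criteria quoted above can be sketched as follows. This is one plausible reading, not NNCF's implementation: it assumes variance is computed per input channel across calibration samples and then reduced with a mean or a maximum. The helper names and the sample data are hypothetical.

```python
from statistics import mean, pvariance


def channel_variances(samples):
    """Variance of each input channel across calibration samples."""
    # samples is a list of activation vectors, one per calibration sample.
    return [pvariance(channel) for channel in zip(*samples)]


def rank_by_activation_variance(layer_inputs, reduce=mean):
    """Return layer names ordered from most to least spread-out activations."""
    scores = {
        name: reduce(channel_variances(samples))
        for name, samples in layer_inputs.items()
    }
    return sorted(scores, key=scores.get, reverse=True), scores


# Made-up input activations for two layers, three samples of two channels each.
layer_inputs = {
    "attn": [[0.0, 4.0], [2.0, -4.0], [4.0, 0.0]],
    "mlp": [[1.0, 1.0], [1.1, 0.9], [0.9, 1.0]],
}
ranking_mean, scores = rank_by_activation_variance(layer_inputs, reduce=mean)
ranking_max, _ = rank_by_activation_variance(layer_inputs, reduce=max)
```

Passing ``reduce=mean`` corresponds to the "Mean Variance" reading and ``reduce=max`` to "Max Variance"; layers earlier in the ranking would be the candidates for higher precision.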

Update openvino_quantizer.rst

ba70b4a

meta-cla Bot added the cla signed label May 11, 2026

anzr299 added 3 commits May 11, 2026 17:17

Update openvino_quantizer.rst

d89f3e8

Update openvino_quantizer.rst

d652ee0

update ovquantizer location in executorch

695bd7d

anzr299 commented May 11, 2026

anzr299 marked this pull request as draft May 11, 2026 13:22

daniil-lyakhov reviewed May 11, 2026

Update openvino_quantizer.rst

6397dec

svekars added the openvino label May 12, 2026

anzr299 added 2 commits May 18, 2026 01:28

Update openvino_quantizer.rst

96a8dee

Merge branch 'main' into patch-1

90fde84

daniil-lyakhov suggested changes May 18, 2026

[OpenVINO Quantizer] Update OpenVINO Quantizer Tutorial #3889
anzr299 wants to merge 7 commits into pytorch:main from anzr299:patch-1

3 participants
