Update doc and code to run quantized model (#157)
Summary:
Pull Request resolved: #157

- Fix doc to separate 1) generating a quantized model and 2) running it with
  executor_runner
- Include <tuple> in choose_qparams
- Include quantized ops by default in executor_runner

Reviewed By: larryliu0820, guangy10

Differential Revision: D48752106

fbshipit-source-id: 30f4e7ba121abeb01b7b97020c2fef0f5d2ac891
kimishpatel authored and facebook-github-bot committed Aug 29, 2023
1 parent b9f37cc commit 1f88ff4
Showing 4 changed files with 18 additions and 6 deletions.
14 changes: 13 additions & 1 deletion examples/README.md
@@ -58,7 +58,9 @@ buck2 run examples/executor_runner:executor_runner -- --model_path mv2.pte
## Quantization
Here is the [Quantization Flow Docs](/docs/website/docs/tutorials/quantization_flow.md).

You can run quantization test with the following command:
### Generating quantized model

You can generate a quantized model with the following command (the example below uses mv2, i.e. MobileNetV2):
```bash
python3 -m examples.quantization.example --model_name "mv2" --so-library "<path/to/so/lib>" # for MobileNetv2
```
@@ -80,6 +82,16 @@ you can also find the valid quantized example models by running:
buck2 run executorch/examples/quantization:example -- --help
```
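
For orientation, here is a minimal sketch of the PT2E quantization flow that such an example typically follows. It is not the repo's exact script: it assumes the `torch.ao.quantization` PT2E APIs (`prepare_pt2e`, `convert_pt2e`, `XNNPACKQuantizer`), and `capture_graph` is a hypothetical placeholder for whichever graph-capture call the example actually uses.

```python
import torch
from torch.ao.quantization.quantize_pt2e import prepare_pt2e, convert_pt2e
from torch.ao.quantization.quantizer.xnnpack_quantizer import (
    XNNPACKQuantizer,
    get_symmetric_quantization_config,
)


def quantize_model(model: torch.nn.Module, example_inputs, capture_graph):
    # capture_graph is a hypothetical stand-in for the pre-autograd graph
    # capture step used by the quantization example.
    graph_module = capture_graph(model.eval(), example_inputs)

    # Configure a quantizer; a global symmetric config is one common choice.
    quantizer = XNNPACKQuantizer()
    quantizer.set_global(get_symmetric_quantization_config())

    # Insert observers, run calibration data through the model, then rewrite
    # the graph with explicit quantize/dequantize ops.
    prepared = prepare_pt2e(graph_module, quantizer)
    prepared(*example_inputs)  # calibration pass
    return convert_pt2e(prepared)
```

The converted module is then exported and serialized to a `.pte` file (e.g. `mv2.pte`) that executor_runner can load, as described in the next section.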

### Running quantized model

A quantized model can be run with executor_runner, just like a floating point model, as shown above:

```bash
buck2 run examples/executor_runner:executor_runner -- --model_path mv2.pte
```

Note that running a quantized model requires the quantize/dequantize operators available in the [quantized kernel lib](/kernels/quantized).
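
As a quick sanity check (a sketch only; `quantized_model` is assumed to be the converted `torch.fx.GraphModule` from the sketch above), you can list the operators the quantized graph calls; the quantize/dequantize targets in that list are the ones the runner needs kernels for:

```python
# Print every call_function target in the quantized graph; expect quantize/
# dequantize ops to appear alongside the regular aten ops.
for node in quantized_model.graph.nodes:
    if node.op == "call_function":
        print(node.target)
```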

## XNNPACK Backend
Please see [Backend README](backend/README) for XNNPACK quantization, export, and run workflow.

6 changes: 3 additions & 3 deletions examples/executor_runner/targets.bzl
@@ -28,13 +28,13 @@ def define_common_targets():

register_custom_op = native.read_config("executorch", "register_custom_op", "0")
register_quantized_ops = native.read_config("executorch", "register_quantized_ops", "0")
custom_ops_lib = []

# Include quantized ops to be able to run quantized model with portable ops
custom_ops_lib = ["//executorch/kernels/quantized:generated_lib"]
if register_custom_op == "1":
custom_ops_lib.append("//executorch/examples/custom_ops:lib_1")
elif register_custom_op == "2":
custom_ops_lib.append("//executorch/examples/custom_ops:lib_2")
if register_quantized_ops == "1":
custom_ops_lib.append("//executorch/kernels/quantized:generated_lib")

# Test driver for models, uses all portable kernels and a demo backend. This
# is intended to have minimal dependencies. If you want a runner that links
3 changes: 1 addition & 2 deletions examples/quantization/test_quantize.sh
@@ -32,8 +32,7 @@ test_buck2_quantization() {
${PYTHON_EXECUTABLE} -m "examples.quantization.example" --so_library="$SO_LIB" --model_name="$1"

echo 'Running executor_runner'
buck2 run //examples/executor_runner:executor_runner \
--config=executorch.register_quantized_ops=1 -- --model_path="./$1.pte"
buck2 run //examples/executor_runner:executor_runner -- --model_path="./$1.pte"
# should give correct result

echo "Removing $1.pte"
1 change: 1 addition & 0 deletions kernels/quantized/cpu/op_choose_qparams.cpp
@@ -11,6 +11,7 @@
#include <algorithm>
#include <cinttypes>
#include <cmath>
#include <tuple>
/**
* For an input tensor, use the scale and zero_point arguments to quantize it.
*/
