PQuant🌶️ integration #1362
base: main
Conversation
Maybe also check out #1338 so that the bit-exact pass also applies to pquant models. In short, if the quantizers are converted to FixedPointQuantizer objects, or to linear layers with the quantization precision set properly and trusted=True, full model precision propagation (this file) can be used with few changes, and all precision inference should work (rules will need to be added for the multiplier ReLU function, but likely nothing else).
If you don't want to use that pass, maybe consider deciding whether users should use config_from_keras_model or not. If not, maybe leave it untouched and do the enforcement in a dedicated pass.
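If the dedicated-pass route were taken, a minimal sketch of what such an enforcement pass could look like is below. It assumes hls4ml's OptimizerPass interface (match/transform); the pass name, the parameter keys ('bits', 'integer_bits', 'signed'), and the choice to overwrite the output precision are illustrative assumptions, not the actual PR code, and registration of the pass into a flow is omitted.

```python
# Hedged sketch of the 'dedicated enforcement pass' option; names and keys are assumptions.
from hls4ml.model.optimizer import OptimizerPass
from hls4ml.model.types import FixedPrecisionType


class EnforcePQuantPrecision(OptimizerPass):
    def match(self, node):
        # Only act on nodes where the parser stored pquant quantization info.
        return node.get_attr('quantization_parameters') is not None

    def transform(self, model, node):
        qp = node.get_attr('quantization_parameters')
        precision = FixedPrecisionType(
            width=qp.get('bits', 16),          # assumed key
            integer=qp.get('integer_bits', 6),  # assumed key
            signed=qp.get('signed', True),      # assumed key
        )
        # Overwrite the node's output precision with the enforced one.
        node.get_output_variable().type.precision = precision
        return False  # the graph structure itself is unchanged
```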
@@ -33,6 +33,10 @@ def handle(
        'n_out': n_out,
        'n_in': n_in,
    }

    if hasattr(layer, 'quantization_parameters'):
        config['quantization_parameters'] = layer.quantization_parameters
Not part of the standard Keras 3 properties; this should be in your pquant handler (maybe a base class for all pquant layers?).
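A rough sketch of such a shared base, assuming KerasV3LayerHandler is importable as in the handlers this PR touches; the mixin and subclass names are placeholders, and the handle signature is kept generic on purpose:

```python
# Sketch only: class names below are placeholders, not part of the PR.
class PQuantHandlerMixin:
    """Shared behaviour for all pquant layer handlers."""

    def attach_quantization_parameters(self, layer, config):
        # Only pquant layers expose this attribute; plain Keras 3 layers do not.
        if hasattr(layer, 'quantization_parameters'):
            config['quantization_parameters'] = layer.quantization_parameters
        return config


class PQuantDenseHandler(PQuantHandlerMixin, KerasV3LayerHandler):
    def handle(self, layer, *args, **kwargs):
        # Reuse whatever config the base handler builds, then add the pquant part.
        config = super().handle(layer, *args, **kwargs)
        return self.attach_quantization_parameters(layer, config)
```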
    config['shift'] = 0.5
    # Quartus seems to have trouble if the width is 1.
    config['slope_prec'] = FixedPrecisionType(width=2, integer=0, signed=False)
    config['shift_prec'] = FixedPrecisionType(width=2, integer=0, signed=False)
width should be 1 here
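For concreteness, the suggested change would look like this, assuming the Quartus width-1 issue mentioned in the code comment above is no longer a blocker:

```python
# Reviewer's suggestion (sketch): single-bit slope/shift precisions.
config['slope_prec'] = FixedPrecisionType(width=1, integer=0, signed=False)
config['shift_prec'] = FixedPrecisionType(width=1, integer=0, signed=False)
```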
@register
class PQuantPoolingHandler(KerasV3LayerHandler):
Looks mostly the same as the original pooling handler. You can probably reuse some code there, I think?
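A rough sketch of that reuse: subclass the existing pooling handler and add only the pquant-specific part. 'PoolingHandler' below is a placeholder for whatever the existing Keras v3 pooling handler class is actually called.

```python
# Sketch only: 'PoolingHandler' is a placeholder name for the existing handler;
# the subclass would also need to list the pquant pooling layer types it handles.
@register
class PQuantPoolingHandler(PoolingHandler):
    def handle(self, layer, *args, **kwargs):
        config = super().handle(layer, *args, **kwargs)
        if hasattr(layer, 'quantization_parameters'):
            config['quantization_parameters'] = layer.quantization_parameters
        return config
```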
@@ -141,4 +141,7 @@ def handle(
    elif isinstance(layer, BaseConv):
        config['weight_data'] = kernel

    if hasattr(layer, 'quantization_parameters'):
        config['quantization_parameters'] = layer.quantization_parameters
Same as above, this should be in the pquant base class.
@@ -44,6 +44,10 @@ def parse_conv1d_layer(operation, layer_name, input_names, input_shapes, node, c

    output_shape = [input_shapes[0][0], layer['n_filt'], layer['out_width']]  # Channel first as default

    # Quantization parameter for PQuant integration
    if hasattr(class_object, "quantization_parameters"):
        layer['quantization_parameters'] = class_object.quantization_parameters
As the handler is functional you can't really subclass it, so this is fine. Though all the quantization parameters saved here can be derived from the values themselves, so it's not strictly necessary...
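On the "can be derived from the values themselves" point, a self-contained sketch of such a derivation (plain NumPy, no hls4ml API; the power-of-two-grid assumption and the returned tuple layout are my own conventions for illustration):

```python
import numpy as np


def derive_quantization_parameters(values, max_fractional_bits=16):
    """Recover fixed-point parameters from already-quantized values (sketch).

    Assumes the values sit on a power-of-two grid, which holds for fixed
    quantizers. Returns (fractional_bits, integer_bits, signed); mapping these
    onto hls4ml's FixedPrecisionType convention is deliberately left out.
    """
    values = np.asarray(values, dtype=np.float64)
    signed = bool((values < 0).any())

    # Smallest f such that every value * 2**f is an integer (grid step 2**-f).
    fractional_bits = 0
    scaled = values
    while fractional_bits < max_fractional_bits and not np.all(scaled == np.round(scaled)):
        fractional_bits += 1
        scaled = values * 2.0**fractional_bits

    # Bits above the binary point needed to cover the largest magnitude.
    max_abs = float(np.max(np.abs(values))) if values.size else 0.0
    integer_bits = max(int(np.floor(np.log2(max_abs))) + 1, 0) if max_abs > 0 else 0

    return fractional_bits, integer_bits, signed
```

For example, derive_quantization_parameters([0.25, -0.75]) gives (2, 0, True).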
@@ -352,7 +352,10 @@ def parse_pytorch_model(config, verbose=True):
    if '.' not in node.target:
        obj = getattr(model, node.name)
    else:
        obj = getattr(children[node.target.split('.')[0], node.name])
        if '_' not in node.name:
Please add a comment explaining why this is needed
if (datareg > 0) {

    if (mul[0] >= 0)
        res[ii] = datareg << mul[0];
<< -3 is legal syntax. Why differentiate the +/- cases?
# TODO In the next version of this function, these should not be exposed to user to tweak
layer_config['Precision'][pname] = str(precision)
# PQuant quantization
if 'quantization_parameters' in layer:
I am a bit confused here. There are three ways to handle quantization parameters:
- QKeras: the parser doesn't say anything about the precision used (actually some, but almost nothing); the quantization configuration is passed through the config generated by config_from_model.
- (old) HGQ1: an optimizer pass enforces the quantization config embedded in the model; by default config_from_model does nothing and is not expected to be used.
- (current) HGQ1/2: a pass infers the layer precision config from the weight/activation quantizers without explicitly passing them; config_from_model still does nothing and is not expected to be used.

Since you are embedding the config into the parsed layer dictionary, I am assuming you are going with way 2, but why add config_from_model precision here? Are users expected to use config_from_model when converting pquant models? If the two conflict after user modification, which takes priority?
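For context, this is roughly where the two mechanisms would meet from a user's point of view. The model, layer name, and precision strings below are made up, and the per-parameter Precision dict layout is assumed to match the hunk above:

```python
import keras
import hls4ml

# Stand-in model; a real pquant model would use pquant layers instead of Dense.
model = keras.Sequential([keras.layers.Dense(4, name='dense_1')])
model.build((None, 16))

# Path 1: user-editable config generated by config_from_keras_model...
config = hls4ml.utils.config_from_keras_model(model, granularity='name')
config['LayerName']['dense_1']['Precision']['weight'] = 'fixed<8,3>'  # user override

# Path 2: the pquant parser also embeds 'quantization_parameters' into the parsed
# layer dict during conversion. If both end up specifying a precision, which wins?
hls_model = hls4ml.converters.convert_from_keras_model(model, hls_config=config)
```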
Description
This draft PR is intended to start a discussion on how to best integrate the PQuant🌶️ library with hls4ml.
The current state of the PR supports layers pruned and quantized using fixed quantizers. Support for high granularity quantization is still in development.
Type of change
Tests
Unit test
⚠️ ⚠️ THE TEST IS CURRENTLY BROKEN! The file needs to be split into two, to test the PyTorch and Keras frontends separately. ⚠️ ⚠️
test/pytest/test_pquant.py was added to test correct parsing of the pruned/quantized layers and that the precision is set correctly.
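One possible shape for the split, with one module per frontend; the file names, the library's import name, and the test body are illustrative only:

```python
# test/pytest/test_pquant_keras.py (sketch); a sibling test_pquant_pytorch.py
# would mirror it for the torch frontend. Everything below is illustrative.
import pytest

pquant = pytest.importorskip('pquant')  # assumed import name; skip if not installed


def test_pquant_keras_parsing():
    # Build a small pquant-pruned/quantized Keras model, convert it with
    # hls4ml.converters.convert_from_keras_model, and assert that the parsed
    # layers carry 'quantization_parameters' and the expected precision.
    ...
```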
Checklist
I have run pre-commit on the files I edited or added.