Instead of fully quantizing the model to `INT8`, you can use mixed precision quantization. This approach keeps unsupported layers such as `Depthwise Conv2D` in `FP32` (float32) while quantizing the rest of the model to `INT8`.
For `TensorFlow Lite`, you can allow an `FP32` fallback for layers that have no `INT8` kernel. Here is how you can adjust your conversion script:
```
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
# Full INT8 quantization needs a calibration generator; see the sketch below.
converter.representative_dataset = representative_data_gen
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS_INT8,  # INT8 quantized ops
    tf.lite.OpsSet.TFLITE_BUILTINS,       # FP32 fallback for unsupported layers
]
converter.inference_input_type = tf.int8   # Input quantized as int8
converter.inference_output_type = tf.int8  # Output quantized as int8
tflite_model = converter.convert()
```
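The `representative_data_gen` above is something you supply yourself: the converter runs a handful of real input samples through the model to calibrate the INT8 scales. Here is a minimal sketch, assuming `calibration_images` is a NumPy array of preprocessed inputs matching your model's input shape (the name and slice size are just illustrative):

```
import numpy as np

def representative_data_gen():
    # Yield a small number of real, preprocessed samples one at a time.
    # `calibration_images` is assumed to be an (N, H, W, C) float32 array.
    for sample in calibration_images[:100]:
        yield [np.expand_dims(sample.astype(np.float32), axis=0)]
```

After conversion you can sanity-check the result by loading it with `tf.lite.Interpreter(model_content=tflite_model)` and confirming that `interpreter.get_input_details()[0]["dtype"]` is `int8`; any layers left in FP32 will simply run with the built-in float kernels.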