diff --git a/vision/classification/resnet/README.md b/vision/classification/resnet/README.md
index 10454c05e..d91b040d7 100644
--- a/vision/classification/resnet/README.md
+++ b/vision/classification/resnet/README.md
@@ -70,7 +70,7 @@ The inference was done using jpeg image.
 ### Preprocessing
 The image needs to be preprocessed before fed to the network.
-The first step is to extract a 224x224 crop from the center of the image. For this, the image is first scaled to a minimum size of 256x256, while keeping aspect ratio. That is, the shortest side of the image is resized to 256 and the other side is scaled accordingly to maintain the original aspect ratio. After that, the image is normalized with mean = 255*[0.485, 0.456, 0.406] and std = 255*[0.229, 0.224, 0.225]. Last step is to transpose it from HWC to CHW layout.
+The first step is to decode the image. Next, we extract a 224x224 crop from the center of the image. For this, the image is first scaled to a minimum size of 256x256, while keeping the aspect ratio. That is, the shortest side of the image is resized to 256 and the other side is scaled accordingly to maintain the original aspect ratio. After that, the image is normalized with mean = 255*[0.485, 0.456, 0.406] and std = 255*[0.229, 0.224, 0.225]. The last step is to transpose it from HWC to CHW layout.
 The described preprocessing steps can be represented with an ONNX model:
 ```python
@@ -81,22 +81,23 @@ from onnx import checker
 resnet_preproc = parser.parse_model('''
 <
     ir_version: 8,
-    opset_import: [ "" : 18, "local" : 1 ],
+    opset_import: [ "" : 20, "local" : 1 ],
     metadata_props: [ "preprocessing_fn" : "local.preprocess"]
 >
-resnet_preproc_g (seq(uint8[?, ?, 3]) images) => (float[B, 3, 224, 224] preproc_data)
+resnet_preproc_g (seq(uint8[?]) images) => (float[B, 3, 224, 224] preproc_data)
 {
     preproc_data = local.preprocess(images)
 }
 <
-    opset_import: [ "" : 18 ],
+    opset_import: [ "" : 20 ],
     domain: "local",
-    doc_string: "Preprocessing function."
+    doc_string: "Preprocessing function, including image decoding."
 >
 preprocess (input_batch) => (output_tensor)
 {
     tmp_seq = SequenceMap <
-        body = sample_preprocessing(uint8[?, ?, 3] sample_in) => (float[3, 224, 224] sample_out) {
+        body = sample_preprocessing(uint8[?] sample_in) => (float[3, 224, 224] sample_out) {
+            image = ImageDecoder (sample_in)
             target_size = Constant ()
             image_resized = Resize
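
For reference, below is a rough Python sketch of the pipeline the updated paragraph describes (decode, resize shortest side to 256, center-crop to 224x224, normalize, transpose to CHW). It is illustrative only and not part of the change: it assumes Pillow and NumPy are available, and the `preprocess` helper name and the bilinear resampling choice are assumptions, not taken from the README.

```python
# Illustrative sketch of the README's preprocessing steps (not part of the PR).
import numpy as np
from PIL import Image

def preprocess(path):
    # Decode the image and force a 3-channel RGB layout.
    img = Image.open(path).convert("RGB")

    # Resize so the shortest side is 256, keeping the aspect ratio.
    w, h = img.size
    scale = 256 / min(w, h)
    img = img.resize((round(w * scale), round(h * scale)), Image.BILINEAR)

    # Extract a 224x224 crop from the center.
    w, h = img.size
    left, top = (w - 224) // 2, (h - 224) // 2
    img = img.crop((left, top, left + 224, top + 224))

    # Normalize with mean = 255*[0.485, 0.456, 0.406], std = 255*[0.229, 0.224, 0.225].
    data = np.asarray(img, dtype=np.float32)
    mean = 255 * np.array([0.485, 0.456, 0.406], dtype=np.float32)
    std = 255 * np.array([0.229, 0.224, 0.225], dtype=np.float32)
    data = (data - mean) / std

    # Transpose from HWC to CHW and add a batch dimension.
    return data.transpose(2, 0, 1)[np.newaxis, ...]
```

The ONNX model in the diff encodes these same steps as graph operators, so the preprocessing travels with the model instead of living in client-side code like the sketch above.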