author Georgios Pinitas <georgios.pinitas@arm.com> 2021-02-15 20:42:39 +0000
committer Giorgio Arena <giorgio.arena@arm.com> 2021-02-16 11:59:52 +0000
commit 274733dbd323321a9c09668e4f60396bef150e39
tree 46395a6fe0d99aecbb2b17d99802b4f134637252 /arm_compute/runtime/NEON/functions/NEGEMMConvolutionLayer.h
parent cab1ab92813a346779bacd728ef8d7d4159abac6
Handle Conv2d layer with implicit output padding in NHWC

Corner cases exist when the output top/bottom padding is non-zero for the Convolution Layer. This can cause invalid output from NEGEMMConvolutionLayer, as the assembly kernel integration does not handle such cases efficiently. As a workaround, we always allocate a memory-managed auxiliary tensor, which we use as the GEMM output when padding exists, and then copy it to the padded output. If no padding exists, we import the output tensor memory into the temporary buffer and perform the calculation as before.

Resolves: COMPMID-4114

Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: If82d0e115b8369b91d775895d5315b044306cc74
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5083
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
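
The workaround in the commit message can be illustrated with a minimal, self-contained sketch. It does not use the Compute Library API: `PaddedOutput`, `fake_gemm` and `run_conv` are hypothetical names invented for illustration, and the layout (each plane of the output surrounded by `pad_top` and `pad_bottom` rows) is an assumption about what implicit output padding looks like in memory. The kernel stand-in only knows how to write a dense result, so the padded case goes through a dense temporary and a per-plane copy, while the unpadded case writes straight into the output storage.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical output buffer: `planes` planes of `rows` x `cols` elements,
// each plane surrounded by `pad_top` / `pad_bottom` padding rows.
struct PaddedOutput
{
    std::size_t planes, rows, cols, pad_top, pad_bottom;
    std::vector<float> data; // padding and payload, initialised to -1

    PaddedOutput(std::size_t p, std::size_t r, std::size_t c, std::size_t top, std::size_t bottom)
        : planes(p), rows(r), cols(c), pad_top(top), pad_bottom(bottom),
          data(p * (top + r + bottom) * c, -1.0f)
    {
    }

    bool has_padding() const { return pad_top != 0 || pad_bottom != 0; }

    // Distance in elements between the starts of two consecutive planes.
    std::size_t plane_stride() const { return (pad_top + rows + pad_bottom) * cols; }

    // First valid (non-padding) element of plane p.
    float *plane_begin(std::size_t p) { return data.data() + p * plane_stride() + pad_top * cols; }
};

// Stand-in for a kernel that can only write a dense, contiguous result.
void fake_gemm(float *dst, std::size_t count)
{
    for (std::size_t i = 0; i < count; ++i)
    {
        dst[i] = static_cast<float>(i);
    }
}

void run_conv(PaddedOutput &out)
{
    const std::size_t plane_elems = out.rows * out.cols;

    if (!out.has_padding())
    {
        // No padding: the output is already dense, so the kernel writes into
        // it directly (loosely analogous to importing the output memory).
        fake_gemm(out.data.data(), out.planes * plane_elems);
        return;
    }

    // Padding present: compute into a dense auxiliary buffer first...
    std::vector<float> tmp(out.planes * plane_elems);
    fake_gemm(tmp.data(), tmp.size());

    // ...then copy each plane into the valid region of the padded output.
    for (std::size_t p = 0; p < out.planes; ++p)
    {
        std::copy_n(tmp.data() + p * plane_elems, plane_elems, out.plane_begin(p));
    }
}
```

In this sketch the extra copy is only paid when padding is present, which mirrors the trade-off described above: correctness in the padded corner case at the cost of one additional pass over the output.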
Diffstat (limited to 'arm_compute/runtime/NEON/functions/NEGEMMConvolutionLayer.h')
-rw-r--r-- arm_compute/runtime/NEON/functions/NEGEMMConvolutionLayer.h | 2
1 files changed, 2 insertions, 0 deletions
diff --git a/arm_compute/runtime/NEON/functions/NEGEMMConvolutionLayer.h b/arm_compute/runtime/NEON/functions/NEGEMMConvolutionLayer.h
index aadc429864..65c2ef7e0b 100644
--- a/arm_compute/runtime/NEON/functions/NEGEMMConvolutionLayer.h
+++ b/arm_compute/runtime/NEON/functions/NEGEMMConvolutionLayer.h
@@ -275,10 +275,12 @@ private:
NEReshapeLayer _reshape_layer;
const ITensor *_original_weights;
+ const ITensor *_original_output;
Tensor _im2col_output;
Tensor _weights_reshaped;
Tensor _gemm_output;
+ Tensor _gemm_output_3d;
Tensor _tmp_output;
DataLayout _data_layout;