author | Gian Marco Iodice <gianmarco.iodice@arm.com> | 2021-08-11 14:06:28 +0100 |
---|---|---|
committer | Giorgio Arena <giorgio.arena@arm.com> | 2021-08-11 15:08:28 +0000 |
commit | d761a3e3c153083cd3843fe686f27e3438c87d1c (patch) | |
tree | 27046940a8ff7faa33374982f695d216212840c1 /src | |
parent | 288d3cb4beb7bbfdb2f8ce2811a07bf285a00d21 (diff) | |
Fix performance regression due to clFinish()
- ClGemmLowpMatrixMultiplyCore::prepare always called clFinish(),
even when the workload was already prepared
Resolves COMPMID-4707
Change-Id: Icdcee528590e2c5efb75325a80c2a45ec84993d1
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6082
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Diffstat (limited to 'src')
-rw-r--r-- | src/runtime/gpu/cl/operators/ClGemmLowpMatrixMultiplyCore.cpp | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/src/runtime/gpu/cl/operators/ClGemmLowpMatrixMultiplyCore.cpp b/src/runtime/gpu/cl/operators/ClGemmLowpMatrixMultiplyCore.cpp
index 64c8743f13..0c72912642 100644
--- a/src/runtime/gpu/cl/operators/ClGemmLowpMatrixMultiplyCore.cpp
+++ b/src/runtime/gpu/cl/operators/ClGemmLowpMatrixMultiplyCore.cpp
@@ -773,9 +773,9 @@ void ClGemmLowpMatrixMultiplyCore::prepare(ITensorPack &tensors)
                 shifts_tensor->unmap(CLScheduler::get().queue());
             }
         }
+        CLScheduler::get().queue().finish();
         _is_prepared = true;
     }
-    CLScheduler::get().queue().finish();
 }
 
 experimental::MemoryRequirements ClGemmLowpMatrixMultiplyCore::workspace() const
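The effect of moving the finish() call inside the `_is_prepared` guard can be illustrated with a minimal, self-contained sketch. It is not ComputeLibrary code: `FakeQueue`, `FakeOperator` and `finish_calls_after` are hypothetical names, the queue only counts blocking calls instead of talking to a device, and the sketch assumes that run() invokes prepare() on every iteration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a command queue: it only counts blocking finish() calls.
struct FakeQueue
{
    std::size_t finish_calls = 0;
    void finish() { ++finish_calls; } // a real queue would block here until the device is idle
};

// Hypothetical operator that mirrors the control flow of prepare() after the fix.
class FakeOperator
{
public:
    explicit FakeOperator(FakeQueue &queue) : _queue(queue) {}

    void prepare()
    {
        if(!_is_prepared)
        {
            // ... one-time preparation work would be enqueued here ...
            _queue.finish();     // synchronise only when work was actually enqueued
            _is_prepared = true;
        }
        // Before the fix, finish() sat here and therefore ran on every call.
    }

    void run()
    {
        prepare(); // assumption: run() calls prepare() on every iteration
        // ... the actual workload would be enqueued here ...
    }

private:
    FakeQueue &_queue;
    bool       _is_prepared = false;
};

// Returns how many times finish() was called over `iterations` run() calls.
inline std::size_t finish_calls_after(std::size_t iterations)
{
    FakeQueue    queue;
    FakeOperator op(queue);
    for(std::size_t i = 0; i < iterations; ++i)
    {
        op.run();
    }
    return queue.finish_calls;
}
```

Under these assumptions, `finish_calls_after(100)` reports a single blocking call, whereas the pre-fix placement would have produced one per iteration, which is the kind of per-run stall the commit message describes.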