Port CLGEMM to memory injecting interface

Moves the following kernels:
 - CLGEMMMatrixMultiplyKernel
 - CLGEMMMatrixMultiplyNativeKernel
 - CLGEMMMatrixMultiplyReshapedKernel
 - CLGEMMMatrixMultiplyReshapedOnlyRHSKernel

Moves the following functions:
 - CLGEMM

Introduces facilities to ease the handling of auxiliary temporary buffers
under the new run interface. These are:
 - CLAuxTensorHandler: allows wrapping workspace buffer memory in
   CLBuffer objects
 - Ability to inject TensorInfo into the allocator without transferring
   ownership. This reduces the copy overhead where needed.

Resolves: COMPMID-4188

Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I7055435d831b05b749b26302082e4ac45f26dfb0
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5498
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
author: Georgios Pinitas <georgios.pinitas@arm.com> 2021-04-22 21:13:21 +0100
committer: Georgios Pinitas <georgios.pinitas@arm.com> 2021-05-18 14:48:39 +0000
commit: 856f66e6c61b77d03f754cd0fa8439891f0e4aca (patch)
tree: f9379cd0853ac407109e54c3d53b385ceee066c2 /docs
parent: 37f4b2ef1ea225a90ccb563fcb2c08f8fb0fb5d5 (diff)
3 files changed, 17 insertions, 17 deletions
diff --git a/docs/ComputeLibrary.dir b/docs/ComputeLibrary.dir
index 74ac9d9d23..e08f05eb2d 100644
--- a/docs/ComputeLibrary.dir
+++ b/docs/ComputeLibrary.dir
@@ -230,7 +230,7 @@
  *  @brief Scalar operations
  */
 
-/** @dir src/core/CL/gemm
+/** @dir src/core/gpu/cl/kernels/gemm
  *  @brief Folder containing all the configuration files for GEMM
  */
 
diff --git a/docs/user_guide/introduction.dox b/docs/user_guide/introduction.dox
index 25274958ba..6b10b9c2a2 100644
--- a/docs/user_guide/introduction.dox
+++ b/docs/user_guide/introduction.dox
@@ -99,7 +99,7 @@ This archive contains:
  - The latest Khronos EGL 1.5 C headers from the <a href="https://www.khronos.org/registry/gles/">Khronos EGL registry</a>
  - The sources for a stub version of libOpenCL.so, libGLESv1_CM.so, libGLESv2.so and libEGL.so to help you build your application.
  - An examples folder containing a few examples to compile and link against the library.
- - A @ref utils folder containing headers with some boiler plate code used by the examples.
+ - A utils folder containing headers with some boiler plate code used by the examples.
  - This documentation.
 
  For detailed information about file organization, please refer to Files -> File List section of this documentation.
diff --git a/docs/user_guide/release_version_and_change_log.dox b/docs/user_guide/release_version_and_change_log.dox
index 0f4d4cc0d5..a975e8b35e 100644
--- a/docs/user_guide/release_version_and_change_log.dox
+++ b/docs/user_guide/release_version_and_change_log.dox
@@ -280,7 +280,7 @@ v20.11 Public major release
    - CLLogits1DMaxShiftExpSumKernel
    - CLLogits1DNormKernel
    - CLHeightConcatenateLayerKernel
-   - @ref CLGEMMMatrixMultiplyKernel
+   - CLGEMMMatrixMultiplyKernel
    - @ref CLGEMMLowpQuantizeDownInt32ScaleKernel
    - @ref CLGEMMLowpQuantizeDownInt32ScaleByFloatKernel
    - @ref CLGEMMLowpMatrixMultiplyReshapedOnlyRHSKernel
@@ -567,14 +567,14 @@ v20.08 Public major release
       The default "axis" value for @ref NESoftmaxLayer, @ref NELogSoftmaxLayer is changed from 1 to 0.
       Only axis 0 is supported.
  - The support for quantized data types has been removed from @ref CLLogSoftmaxLayer due to implementation complexity.
- - Removed padding requirement for the input (e.g. LHS of GEMM) and output in @ref CLGEMMMatrixMultiplyNativeKernel, @ref CLGEMMMatrixMultiplyReshapedKernel, @ref CLGEMMMatrixMultiplyReshapedOnlyRHSKernel and @ref CLIm2ColKernel (NHWC only)
+ - Removed padding requirement for the input (e.g. LHS of GEMM) and output in CLGEMMMatrixMultiplyNativeKernel, CLGEMMMatrixMultiplyReshapedKernel, CLGEMMMatrixMultiplyReshapedOnlyRHSKernel and @ref CLIm2ColKernel (NHWC only)
    - This change allows to use @ref CLGEMMConvolutionLayer without extra padding for the input and output.
    - Only the weights/bias of @ref CLGEMMConvolutionLayer could require padding for the computation.
-   - Only on Arm® Mali™ Midgard GPUs, @ref CLGEMMConvolutionLayer could require padding since @ref CLGEMMMatrixMultiplyKernel is called and currently requires padding.
- - Added support for exporting the OpenCL buffer object to the OpenCL image object in @ref CLGEMMMatrixMultiplyReshapedKernel and @ref CLGEMMMatrixMultiplyReshapedOnlyRHSKernel.
+   - Only on Arm® Mali™ Midgard GPUs, @ref CLGEMMConvolutionLayer could require padding since CLGEMMMatrixMultiplyKernel is called and currently requires padding.
+ - Added support for exporting the OpenCL buffer object to the OpenCL image object in CLGEMMMatrixMultiplyReshapedKernel and CLGEMMMatrixMultiplyReshapedOnlyRHSKernel.
    - This support allows to export the OpenCL buffer used for the reshaped RHS matrix to the OpenCL image object.
-   - The padding requirement for the OpenCL image object is considered into the @ref CLGEMMReshapeRHSMatrixKernel.
-   - The reshaped RHS matrix stores the weights when GEMM is used to accelerate @ref CLGEMMConvolutionLayer.
+   - The padding requirement for the OpenCL image object is considered into the CLGEMMReshapeRHSMatrixKernel.
+   - The reshaped RHS matrix stores the weights when GEMM is used to accelerate CLGEMMConvolutionLayer.
 
 v20.05 Public major release
  - Various bug fixes.
@@ -739,7 +739,7 @@ v19.11 Public major release
  - Added QASYMM16 support for:
     - @ref CLBoundingBoxTransform
  - Added FP16 support for:
-    - @ref CLGEMMMatrixMultiplyReshapedKernel
+    - CLGEMMMatrixMultiplyReshapedKernel
  - Added new data type QASYMM8_PER_CHANNEL support for:
     - CLDequantizationLayer
     - @ref NEDequantizationLayer
@@ -749,7 +749,7 @@ v19.11 Public major release
     - @ref CLDepthwiseConvolutionLayer
     - @ref NEDepthwiseConvolutionLayer
  - Added FP16 mixed-precision support for:
-    - @ref CLGEMMMatrixMultiplyReshapedKernel
+    - CLGEMMMatrixMultiplyReshapedKernel
     - CLPoolingLayerKernel
  - Added FP32 and FP16 ELU activation for:
     - @ref CLActivationLayer
@@ -813,9 +813,9 @@ v19.08 Public major release
     - @ref CLSinLayer
     - CLBatchConcatenateLayerKernel
     - @ref CLDepthToSpaceLayerKernel / @ref CLDepthToSpaceLayer
-    - @ref CLGEMMLowpMatrixMultiplyNativeKernel
+    - CLGEMMLowpMatrixMultiplyNativeKernel
     - CLGEMMLowpQuantizeDownInt32ToInt16ScaleByFixedPointKernel
-    - @ref CLGEMMMatrixMultiplyNativeKernel
+    - CLGEMMMatrixMultiplyNativeKernel
     - CLMeanStdDevNormalizationKernel /CLMeanStdDevNormalizationLayer
     - @ref CLSpaceToDepthLayerKernel / @ref CLSpaceToDepthLayer
  - New examples:
@@ -862,7 +862,7 @@ v19.05 Public major release
     - @ref CLFFTRadixStageKernel
     - @ref CLFFTScaleKernel
     - @ref CLGEMMLowpMatrixMultiplyReshapedOnlyRHSKernel
-    - @ref CLGEMMMatrixMultiplyReshapedOnlyRHSKernel
+    - CLGEMMMatrixMultiplyReshapedOnlyRHSKernel
     - CLHeightConcatenateLayerKernel
     - @ref CLDirectDeconvolutionLayer
     - @ref CLFFT1D
@@ -947,9 +947,9 @@ v19.02 Public major release
     - @ref CLRsqrtLayer
     - @ref CLExpLayer
     - CLElementWiseUnaryLayerKernel
-    - @ref CLGEMMReshapeLHSMatrixKernel
-    - @ref CLGEMMReshapeRHSMatrixKernel
-    - @ref CLGEMMMatrixMultiplyReshapedKernel
+    - CLGEMMReshapeLHSMatrixKernel
+    - CLGEMMReshapeRHSMatrixKernel
+    - CLGEMMMatrixMultiplyReshapedKernel
     - @ref CLRangeKernel / @ref CLRange
     - @ref CLUnstack
     - @ref CLGatherKernel / @ref CLGather
@@ -1369,7 +1369,7 @@ v17.03.1 First Major public release of the sources
 v17.03 Sources preview
  - New OpenCL kernels / functions:
    - CLGradientKernel, CLEdgeNonMaxSuppressionKernel, CLEdgeTraceKernel / CLCannyEdge
-   - GEMM refactoring + FP16 support: CLGEMMInterleave4x4Kernel, CLGEMMTranspose1xWKernel, @ref CLGEMMMatrixMultiplyKernel, CLGEMMMatrixAdditionKernel / @ref CLGEMM
+   - GEMM refactoring + FP16 support: CLGEMMInterleave4x4Kernel, CLGEMMTranspose1xWKernel, CLGEMMMatrixMultiplyKernel, CLGEMMMatrixAdditionKernel / @ref CLGEMM
    - CLGEMMMatrixAccumulateBiasesKernel / @ref CLFullyConnectedLayer
    - CLTransposeKernel / @ref CLTranspose
    - CLLKTrackerInitKernel, CLLKTrackerStage0Kernel, CLLKTrackerStage1Kernel, CLLKTrackerFinalizeKernel / CLOpticalFlow