author: Georgios Pinitas <georgios.pinitas@arm.com> 2021-04-22 21:13:21 +0100
committer: Georgios Pinitas <georgios.pinitas@arm.com> 2021-05-18 14:48:39 +0000
commit: 856f66e6c61b77d03f754cd0fa8439891f0e4aca
tree: f9379cd0853ac407109e54c3d53b385ceee066c2 /docs
parent: 37f4b2ef1ea225a90ccb563fcb2c08f8fb0fb5d5
Port CLGEMM to memory injecting interface

Moves the following kernels:
- CLGEMMMatrixMultiplyKernel
- CLGEMMMatrixMultiplyNativeKernel
- CLGEMMMatrixMultiplyReshapedKernel
- CLGEMMMatrixMultiplyReshapedOnlyRHSKernel

Moves the following functions:
- CLGEMM

Introduces facilities to ease the handling of auxiliary temporary buffers under the new run interface:
- CLAuxTensorHandler: allows wrapping workspace buffer memory in CLBuffer objects
- Ability to inject TensorInfo into the allocator without transferring ownership. This reduces copy overhead.

Resolves: COMPMID-4188

Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I7055435d831b05b749b26302082e4ac45f26dfb0
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5498
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Diffstat (limited to 'docs')
-rw-r--r-- docs/ComputeLibrary.dir | 2
-rw-r--r-- docs/user_guide/introduction.dox | 2
-rw-r--r-- docs/user_guide/release_version_and_change_log.dox | 30
3 files changed, 17 insertions, 17 deletions
diff --git a/docs/ComputeLibrary.dir b/docs/ComputeLibrary.dir
index 74ac9d9d23..e08f05eb2d 100644
--- a/docs/ComputeLibrary.dir
+++ b/docs/ComputeLibrary.dir
@@ -230,7 +230,7 @@
* @brief Scalar operations
*/
-/** @dir src/core/CL/gemm
+/** @dir src/core/gpu/cl/kernels/gemm
* @brief Folder containing all the configuration files for GEMM
*/
diff --git a/docs/user_guide/introduction.dox b/docs/user_guide/introduction.dox
index 25274958ba..6b10b9c2a2 100644
--- a/docs/user_guide/introduction.dox
+++ b/docs/user_guide/introduction.dox
@@ -99,7 +99,7 @@ This archive contains:
- The latest Khronos EGL 1.5 C headers from the <a href="https://www.khronos.org/registry/gles/">Khronos EGL registry</a>
- The sources for a stub version of libOpenCL.so, libGLESv1_CM.so, libGLESv2.so and libEGL.so to help you build your application.
- An examples folder containing a few examples to compile and link against the library.
- - A @ref utils folder containing headers with some boiler plate code used by the examples.
+ - A utils folder containing headers with some boiler plate code used by the examples.
- This documentation.
For detailed information about file organization, please refer to Files -> File List section of this documentation.
diff --git a/docs/user_guide/release_version_and_change_log.dox b/docs/user_guide/release_version_and_change_log.dox
index 0f4d4cc0d5..a975e8b35e 100644
--- a/docs/user_guide/release_version_and_change_log.dox
+++ b/docs/user_guide/release_version_and_change_log.dox
@@ -280,7 +280,7 @@ v20.11 Public major release
- CLLogits1DMaxShiftExpSumKernel
- CLLogits1DNormKernel
- CLHeightConcatenateLayerKernel
- - @ref CLGEMMMatrixMultiplyKernel
+ - CLGEMMMatrixMultiplyKernel
- @ref CLGEMMLowpQuantizeDownInt32ScaleKernel
- @ref CLGEMMLowpQuantizeDownInt32ScaleByFloatKernel
- @ref CLGEMMLowpMatrixMultiplyReshapedOnlyRHSKernel
@@ -567,14 +567,14 @@ v20.08 Public major release
The default "axis" value for @ref NESoftmaxLayer, @ref NELogSoftmaxLayer is changed from 1 to 0.
Only axis 0 is supported.
- The support for quantized data types has been removed from @ref CLLogSoftmaxLayer due to implementation complexity.
- - Removed padding requirement for the input (e.g. LHS of GEMM) and output in @ref CLGEMMMatrixMultiplyNativeKernel, @ref CLGEMMMatrixMultiplyReshapedKernel, @ref CLGEMMMatrixMultiplyReshapedOnlyRHSKernel and @ref CLIm2ColKernel (NHWC only)
+ - Removed padding requirement for the input (e.g. LHS of GEMM) and output in CLGEMMMatrixMultiplyNativeKernel, CLGEMMMatrixMultiplyReshapedKernel, CLGEMMMatrixMultiplyReshapedOnlyRHSKernel and @ref CLIm2ColKernel (NHWC only)
- This change allows to use @ref CLGEMMConvolutionLayer without extra padding for the input and output.
- Only the weights/bias of @ref CLGEMMConvolutionLayer could require padding for the computation.
- - Only on Arm® Mali™ Midgard GPUs, @ref CLGEMMConvolutionLayer could require padding since @ref CLGEMMMatrixMultiplyKernel is called and currently requires padding.
- - Added support for exporting the OpenCL buffer object to the OpenCL image object in @ref CLGEMMMatrixMultiplyReshapedKernel and @ref CLGEMMMatrixMultiplyReshapedOnlyRHSKernel.
+ - Only on Arm® Mali™ Midgard GPUs, @ref CLGEMMConvolutionLayer could require padding since CLGEMMMatrixMultiplyKernel is called and currently requires padding.
+ - Added support for exporting the OpenCL buffer object to the OpenCL image object in CLGEMMMatrixMultiplyReshapedKernel and CLGEMMMatrixMultiplyReshapedOnlyRHSKernel.
- This support allows to export the OpenCL buffer used for the reshaped RHS matrix to the OpenCL image object.
- - The padding requirement for the OpenCL image object is considered into the @ref CLGEMMReshapeRHSMatrixKernel.
- - The reshaped RHS matrix stores the weights when GEMM is used to accelerate @ref CLGEMMConvolutionLayer.
+ - The padding requirement for the OpenCL image object is considered into the CLGEMMReshapeRHSMatrixKernel.
+ - The reshaped RHS matrix stores the weights when GEMM is used to accelerate CLGEMMConvolutionLayer.
v20.05 Public major release
- Various bug fixes.
@@ -739,7 +739,7 @@ v19.11 Public major release
- Added QASYMM16 support for:
- @ref CLBoundingBoxTransform
- Added FP16 support for:
- - @ref CLGEMMMatrixMultiplyReshapedKernel
+ - CLGEMMMatrixMultiplyReshapedKernel
- Added new data type QASYMM8_PER_CHANNEL support for:
- CLDequantizationLayer
- @ref NEDequantizationLayer
@@ -749,7 +749,7 @@ v19.11 Public major release
- @ref CLDepthwiseConvolutionLayer
- @ref NEDepthwiseConvolutionLayer
- Added FP16 mixed-precision support for:
- - @ref CLGEMMMatrixMultiplyReshapedKernel
+ - CLGEMMMatrixMultiplyReshapedKernel
- CLPoolingLayerKernel
- Added FP32 and FP16 ELU activation for:
- @ref CLActivationLayer
@@ -813,9 +813,9 @@ v19.08 Public major release
- @ref CLSinLayer
- CLBatchConcatenateLayerKernel
- @ref CLDepthToSpaceLayerKernel / @ref CLDepthToSpaceLayer
- - @ref CLGEMMLowpMatrixMultiplyNativeKernel
+ - CLGEMMLowpMatrixMultiplyNativeKernel
- CLGEMMLowpQuantizeDownInt32ToInt16ScaleByFixedPointKernel
- - @ref CLGEMMMatrixMultiplyNativeKernel
+ - CLGEMMMatrixMultiplyNativeKernel
- CLMeanStdDevNormalizationKernel /CLMeanStdDevNormalizationLayer
- @ref CLSpaceToDepthLayerKernel / @ref CLSpaceToDepthLayer
- New examples:
@@ -862,7 +862,7 @@ v19.05 Public major release
- @ref CLFFTRadixStageKernel
- @ref CLFFTScaleKernel
- @ref CLGEMMLowpMatrixMultiplyReshapedOnlyRHSKernel
- - @ref CLGEMMMatrixMultiplyReshapedOnlyRHSKernel
+ - CLGEMMMatrixMultiplyReshapedOnlyRHSKernel
- CLHeightConcatenateLayerKernel
- @ref CLDirectDeconvolutionLayer
- @ref CLFFT1D
@@ -947,9 +947,9 @@ v19.02 Public major release
- @ref CLRsqrtLayer
- @ref CLExpLayer
- CLElementWiseUnaryLayerKernel
- - @ref CLGEMMReshapeLHSMatrixKernel
- - @ref CLGEMMReshapeRHSMatrixKernel
- - @ref CLGEMMMatrixMultiplyReshapedKernel
+ - CLGEMMReshapeLHSMatrixKernel
+ - CLGEMMReshapeRHSMatrixKernel
+ - CLGEMMMatrixMultiplyReshapedKernel
- @ref CLRangeKernel / @ref CLRange
- @ref CLUnstack
- @ref CLGatherKernel / @ref CLGather
@@ -1369,7 +1369,7 @@ v17.03.1 First Major public release of the sources
v17.03 Sources preview
- New OpenCL kernels / functions:
- CLGradientKernel, CLEdgeNonMaxSuppressionKernel, CLEdgeTraceKernel / CLCannyEdge
- - GEMM refactoring + FP16 support: CLGEMMInterleave4x4Kernel, CLGEMMTranspose1xWKernel, @ref CLGEMMMatrixMultiplyKernel, CLGEMMMatrixAdditionKernel / @ref CLGEMM
+ - GEMM refactoring + FP16 support: CLGEMMInterleave4x4Kernel, CLGEMMTranspose1xWKernel, CLGEMMMatrixMultiplyKernel, CLGEMMMatrixAdditionKernel / @ref CLGEMM
- CLGEMMMatrixAccumulateBiasesKernel / @ref CLFullyConnectedLayer
- CLTransposeKernel / @ref CLTranspose
- CLLKTrackerInitKernel, CLLKTrackerStage0Kernel, CLLKTrackerStage1Kernel, CLLKTrackerFinalizeKernel / CLOpticalFlow