diff options
author | Michalis Spyrou <michalis.spyrou@arm.com> | 2021-06-07 14:23:57 +0100 |
---|---|---|
committer | Georgios Pinitas <georgios.pinitas@arm.com> | 2021-06-23 12:25:50 +0000 |
commit | 20fca524baf99402f742ce38c538f2fd07d5fff9 (patch) | |
tree | b63d98383d1ba22bb3ca59d393e4ab9d47a9c762 /src/core/NEON/kernels/arm_gemm/mergeresults.cpp | |
parent | 1d359279e22874121def2ce4bfdb633d94ea5ade (diff) | |
download | ComputeLibrary-20fca524baf99402f742ce38c538f2fd07d5fff9.tar.gz |
Create core library using high priority operators
A smaller core library is created using a subset of the operators.
Changed the structure of filelist.json in order to include more
information about the kernels and make the selection easier.
Resolves: COMPMID-4514
Change-Id: I079ca7d8e64346174eebdd13b834e1dd4dc36ca2
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5786
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Diffstat (limited to 'src/core/NEON/kernels/arm_gemm/mergeresults.cpp')
-rw-r--r-- | src/core/NEON/kernels/arm_gemm/mergeresults.cpp | 6 |
1 files changed, 5 insertions, 1 deletions
diff --git a/src/core/NEON/kernels/arm_gemm/mergeresults.cpp b/src/core/NEON/kernels/arm_gemm/mergeresults.cpp index 17566db375..bbfe8f23d9 100644 --- a/src/core/NEON/kernels/arm_gemm/mergeresults.cpp +++ b/src/core/NEON/kernels/arm_gemm/mergeresults.cpp @@ -37,9 +37,13 @@ namespace arm_gemm { template<unsigned int twidth, unsigned int height, bool sve=false, typename Tin, typename Tout> void MergeResults(Tout * out, const Tin * in, int ldc, int y0, int ymax, int x0, int xmax, const Tout *bias, Activation act, bool append) { + // NOTE: The following code is disabled to avoid calling get_vector_length(), so templated MergeResults will not + // be correct for SVE cases. This is OK as we have specialisations for all needed SVE cases anyway. + // // For SVE cases, multiply the width up by the vector length. // Use the *input* type to determine this, since this will be what the kernel operated on. - const int width = twidth * (sve ? get_vector_length<Tin>() : 1); + // const int width = twidth * (sve ? get_vector_length<Tin>() : 1); + const int width = twidth; const int full_y_blocks = (ymax - y0) / height; const int y_remainder = (ymax - y0) % height; |