author | Anthony Barbier <anthony.barbier@arm.com> | 2017-10-27 15:01:44 +0100 |
---|---|---|
committer | Anthony Barbier <anthony.barbier@arm.com> | 2018-11-02 16:35:24 +0000 |
commit | e500747b5c1d27aeffae316c8190f6d169bb2fbd (patch) | |
tree | f26f748a92d1852b3280f492b0a26a980313f29f /src/core/NEON | |
parent | 16cdf89eec95986d1b386312ccf3b221f6a1bad4 (diff) | |
download | ComputeLibrary-e500747b5c1d27aeffae316c8190f6d169bb2fbd.tar.gz |
COMPMID-556: Cherry-picked minor fixes from Github
- Added --api 21 to documentation
- Removed include of a runtime header in Core
- cherry-picked 2 small fixes from Github
commit 869d424d6fd5df7b15a858f2c5f853536f7a0aca
Author: giorgio-arena <arena.cpp@gmail.com>
Date: Mon Oct 23 16:58:59 2017 +0100
Update 00_introduction.dox
commit f054c210e493111061a458b887f7c4edaca06a9f
Author: Forrest Iandola <fiandola@gmail.com>
Date: Mon Oct 16 00:44:24 2017 -0700
fix comment
typo
kernerl's --> kernel's
Change-Id: I0a3893148a9565acbfd18d340da41845ce3ad44f
Reviewed-on: http://mpd-gerrit.cambridge.arm.com/93460
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
Diffstat (limited to 'src/core/NEON')
-rw-r--r-- | src/core/NEON/kernels/NEDirectConvolutionLayerKernel.cpp | 2 |
-rw-r--r-- | src/core/NEON/kernels/NEPixelWiseMultiplicationKernel.cpp | 1 |
2 files changed, 1 insertions, 2 deletions
diff --git a/src/core/NEON/kernels/NEDirectConvolutionLayerKernel.cpp b/src/core/NEON/kernels/NEDirectConvolutionLayerKernel.cpp
index 8642a19f39..60a3a1b636 100644
--- a/src/core/NEON/kernels/NEDirectConvolutionLayerKernel.cpp
+++ b/src/core/NEON/kernels/NEDirectConvolutionLayerKernel.cpp
@@ -1082,7 +1082,7 @@ public:
 the third thread [16,24] and the fourth thread [25,31].
 
 The algorithm outer loop iterates over Z, P, Y, X where P is the depth/3rd dimension of each kernel. This order is not arbitrary, the main benefit of this
- is that we setup the neon registers containing the kernerl's values only once and then compute each XY using the preloaded registers as opposed as doing this for every XY value.
+ is that we setup the neon registers containing the kernel's values only once and then compute each XY using the preloaded registers as opposed as doing this for every XY value.
 
 The algorithm does not require allocating any additional memory amd computes the results directly in-place in two stages:
 1) Convolve plane 0 with kernel 0 and initialize the corresponding output plane with these values.
diff --git a/src/core/NEON/kernels/NEPixelWiseMultiplicationKernel.cpp b/src/core/NEON/kernels/NEPixelWiseMultiplicationKernel.cpp
index 2c90d9aa22..c622d4ffc2 100644
--- a/src/core/NEON/kernels/NEPixelWiseMultiplicationKernel.cpp
+++ b/src/core/NEON/kernels/NEPixelWiseMultiplicationKernel.cpp
@@ -30,7 +30,6 @@
 #include "arm_compute/core/NEON/NEFixedPoint.h"
 #include "arm_compute/core/TensorInfo.h"
 #include "arm_compute/core/Validate.h"
-#include "arm_compute/runtime/NEON/functions/NEPixelWiseMultiplication.h"
 
 #include <arm_neon.h>
 #include <climits>
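The comment touched in NEDirectConvolutionLayerKernel.cpp describes a Z, P, Y, X loop order in which the kernel values are loaded once per (Z, P) pair and the output is built up in place. The following is a minimal scalar sketch of that idea, not the library's implementation. The function name direct_conv3x3, the 3x3 kernel size, stride 1, no padding, and the flat memory layouts are all assumptions made for illustration. A small local array stands in for the NEON registers. The diff context only shows stage 1, so the accumulation of the remaining planes into the same output is also an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical scalar illustration of the Z, P, Y, X loop order.
// Assumed layouts (not taken from the Compute Library):
//   input[p][y][x], weights[z][p][ky][kx], output[z][y][x]
// Assumed parameters: 3x3 kernels, stride 1, no padding.
std::vector<float> direct_conv3x3(const std::vector<float> &input,
                                  std::size_t depth, std::size_t in_h, std::size_t in_w,
                                  const std::vector<float> &weights,
                                  std::size_t num_kernels)
{
    assert(in_h >= 3 && in_w >= 3);
    assert(input.size() == depth * in_h * in_w);
    assert(weights.size() == num_kernels * depth * 9);

    const std::size_t out_h = in_h - 2;
    const std::size_t out_w = in_w - 2;
    std::vector<float> output(num_kernels * out_h * out_w, 0.0f);

    for(std::size_t z = 0; z < num_kernels; ++z)   // Z: one output plane per kernel
    {
        for(std::size_t p = 0; p < depth; ++p)     // P: depth of the kernel
        {
            // Load the nine kernel values once per (z, p).
            // This array plays the role of the preloaded NEON registers.
            float k[9];
            for(std::size_t i = 0; i < 9; ++i)
            {
                k[i] = weights[(z * depth + p) * 9 + i];
            }

            for(std::size_t y = 0; y < out_h; ++y)     // Y
            {
                for(std::size_t x = 0; x < out_w; ++x) // X
                {
                    float acc = 0.0f;
                    for(std::size_t ky = 0; ky < 3; ++ky)
                    {
                        for(std::size_t kx = 0; kx < 3; ++kx)
                        {
                            acc += k[ky * 3 + kx] * input[(p * in_h + y + ky) * in_w + x + kx];
                        }
                    }

                    float &out = output[(z * out_h + y) * out_w + x];
                    if(p == 0)
                    {
                        out = acc;  // stage 1: plane 0 initialises the output plane
                    }
                    else
                    {
                        out += acc; // assumed stage 2: later planes accumulate in place
                    }
                }
            }
        }
    }
    return output;
}
```

Because the kernel values are fetched in the P loop rather than inside the Y and X loops, each set of nine weights is read once per input plane instead of once per output element, which is the benefit the comment describes.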