aboutsummaryrefslogtreecommitdiff
path: root/src/core
AgeCommit message (Collapse)Author
2018-11-02COMPMID-1533 Implementing CLDepthWiseConvolutionLayer with FP16 (NHWC)Giorgio Arena
Change-Id: I46965aeb1fffba8cbf083cab7284c549b0e94d00 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/145334 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
2018-11-02COMPMID-1529 Optimize PoolingLayer NHWC on NEONGiorgio Arena
Change-Id: Ib85e5cc203d6c71f83c6021c776ccdc0eef82acf Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/145165 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1494 Optimise NEON im2col and weights reshape for NHWCGiorgio Arena
Change-Id: I99ebae61024a7bce9d17292a02c28626ae6c29d5 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/144872 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02COMPMID-1246 Fix NEON mobilenet NCHWGiorgio Arena
Change-Id: I1dd6df9bd4a96cb7cbacce939a89c3a7ccee71c8 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/145397 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1534 - Fix GEMM and Magnitude test for FP16Gian Marco Iodice
On GEMM we had accuracy issue On Magnitude we have disabled the fp16 acceleration since we do not have feature parity with CL and this function is not used for ML Change-Id: Iaebe3bbbd2a9f45db0c714aa5ebaf48eb0b65741 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/145467 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1536: Github PR: Removed redundant const qualifier on castKohei Takahashi
GCC (>=8) yields warning w/ -Wignored-qualifers (enabled by -Wextra) on such usage. Change-Id: Ib3284b60cec0ec4faf8c6e6c1e2980cbf5731973 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/145384 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-1534 - Fix NENormalizationLayer for FP16Gian Marco Iodice
Implemented vinvq_f16 with fp32 data type in order to avoid accuracy issue. Change-Id: Ibfffd12e4a941c1388a982fc7bbe3e1832351feb Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/145416 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1534: Fix 2x2 NEPoolingLayer for FP16Georgios Pinitas
Change-Id: Icaf45cad826bb0966a6c663ecb7e828f5fe5e5db Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/145336 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1534 - Fixing FP16 tests on NEONGian Marco Iodice
- Fixed GEMMConvolutionLayer test. The issue was related to the tolerance - Fixed DirectConvolutioNLayer test. The issue was in the convolver_3x3 Change-Id: I9d5b906d7e5e32a0a34300d529d6edb804ac1c4e Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/145377 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1534: Fix NESoftmaxLayer for FP16Georgios Pinitas
Simulates exp function in FP32 Change-Id: Ieffceeab64fda6f466f212b56f794cc44d477afa Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/145367 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1534: Fix NEActivationLayer for FP16Georgios Pinitas
Simulates Logistic, Tanh and SoftRelu in FP32 Change-Id: I9950f7636b8ff2f3e054937e5ef414e45dfe06f5 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/145357 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1188: Add support for activation in NEBatchNormalization.Georgios Pinitas
Change-Id: I1e206574dac6433218db6e138adb7bf5f66a536d Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/145222 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1047 Extract Flatten function from Im2Col for NEONGiorgio Arena
Change-Id: I80f3aaadc8cae8c9ca1a5a239e79bda302b89bd8 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/144813 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02COMPMID-1188: Static tuning of CLScaleGeorgios Pinitas
Change-Id: Icf1cc00d9861fdb8766d0b8fd33ca90833863927 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/144830 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1188: Fix subtensor checkGeorgios Pinitas
Change-Id: Id8366a1d828e2f1a729c70bac1fb232182d59c0c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/144382 Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1366 Implement NECopyMichalis Spyrou
Change-Id: I183e4b7081bf12de3546293a00da68b4f4a0dd5e Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143987 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1188 - Fix CLWinogradConvolutionLayer for NHWCGian Marco Iodice
Change-Id: Ib4abe0388f218276e79f7c4405827e61722f0ef8 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/144240 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1498 - Enable grouping in CLGEMMConvolutionLayerGian Marco Iodice
Change-Id: I15c7df21773145b03f42b6f78bd7ad2e5b8a5219 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/144126 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-1376: Add support for QASYMM8 in CLDeconvolutionLayerMichele Di Giorgio
Change-Id: I13ec79b6668e2b9559d3fa789ae0b51ab6975289 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/139126 Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1499: Fixed issues to build for FP16 on AndroidAnthony Barbier
Change-Id: I7cd15e9115b5c6f544005528d69061751286be11 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143708 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
2018-11-02COMPMID-1188: Assign correct ticket to TODO in NEDerivativeKernelMichele Di Giorgio
Change-Id: I57bbfb79090fd57c57fdedd24a26736b272ea2f5 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143893 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1478: Fixed Doxygen comments + minor fixesAnthony Barbier
- Allow check_bad_style.sh to only run on some of the files - Pass missing lws_hint() in CLNormalizationLayerKernel Change-Id: I2cf44f82f7ba6c8dc8d40691aeec7c6c3de385b5 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143628 Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1473: Added missing TypePrinter for CPUModel, added accessor for ↵Anthony Barbier
number of CPUs Change-Id: If81d58b83143129bed91b9c6658b0cd4e623bc38 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143664 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
2018-11-02COMPMID-1343: Add grouping support to CLCol2ImKernelMichele Di Giorgio
Change-Id: I5188a2163e7341f1915d98c21464fea13a9a7faf Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143330 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
2018-11-02COMPMID-1342 Add grouping support to CLIm2ColKernelGiorgio Arena
Change-Id: I4afb19751520a90fee27fb49b775cd10e92a94f5 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/140476 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1478: Stop relying on static default OpenCL objects in cl2.hppAnthony Barbier
This causes problems when ACL is used as a shared library on Android. Fixes some problems related to creation / destruction order between the Graph's CL backend and core / runtime Change-Id: I716d63fd42f4586df1ffbb6fa97e4db06d3a781b Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143228 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02COMPMID-1188 - Passed WIDTH_OFFSET at compile time in ↵Gian Marco Iodice
CLWidthDepthConcatenateLayerKernel Change-Id: Icab813cd432174608621ee6a87015aeb10ab822d Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143570 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1410 (Nightly) Mismatches in CLMeanStd function for floatMichalis Spyrou
Change CLReductionOperation border to be multiple of 64 instead of 16. The opencl kernel works only with local_size(0) being a power of 2. This will generate a padding of 63 if input_width % 64 = 1, but I don't think it's a big issue and it keeps the border calculation pretty simple. Also, increased tolerance for fp32 because there were mismatches for the 4K image. Change-Id: Id44990a262b2d6eff4c8ce56eb7c886274d9847e Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143415 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1486 - CLGEMMDilatedConvolutionLayer FP16 / FP32 failing in nightliesGian Marco Iodice
Wrong boundary condition in the im2col3x3_nhwc kernel Change-Id: I83e9dd9b425fd0e3227decb1da3d08a3f5e2536d Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143489 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02MLCE-13: Sanitizing matrix argument in the Warp.Pablo Tello
This changes help to prevent errors like passing a matrix with less elements than required into the warp functions. Change-Id: I863f933a5e0568258717cffed3a20788d3d03083 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143044 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1303: CLDepthConvert : Add support for FP32 -> FP16 and FP16 -> FP32 ↵Michele Di Giorgio
+ validate() function Change-Id: I6808de0254a7c4bca440322cc14b795b3b32465b Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142427 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1277 - Optimizing CLIm2ColKernel for NHWC.Gian Marco Iodice
This patch includes: - Im2Col optimizations for NHWC using a new data layout - Refactoring of CLIm2ColKernel adding validation method and auto-init - Removed im2col_reduced from CLIm2ColKernel and created a new kernel CLFlattenLayerKernel Change-Id: I1620640b6796baa268324b33ae92cdd8de53e27c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/141241 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
2018-11-02COMPMID-1344 Add grouping support to CLWeightsReshapeKernelGiorgio Arena
Change-Id: Idde333308db71087ec234b3fd1eb4e36a44db46c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143049 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1481: CLCannyEdge still failing in some precommitsMichele Di Giorgio
Without the check introduced by this patch, all weak edges as marked as strong edges. Change-Id: I874ebf22c06707bd98bd11b9be93602bfcbafa7c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142922 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
2018-11-02COMPMID-1188 - Fixed performance degradation with GEMM3DGian Marco Iodice
The previous implementation of GEMM3D degradated the performance when the input had to be reinterpreted as 3D. However if both input and output have to be reinterpreted as 3D, we can skip the offset calculation for that specific case and run the multi GEMM approach Change-Id: I0d5d48add2c6ccdebfbb268ea199dd181101f3aa Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142872 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1406: Refactor gemm_interleaved to use our own types and schedulerAnthony Barbier
- Ported PrepareB kernel from gemm_interleave - Ported TransformA feature from gemm_interleave - Allocate reshaped a and b buffers - Added memory_manager / memory_group - MatrixMultiply kernel - Interleave kernels execution. - Fixed a few bugs: all nightly Convolution tests passing for threads=1 and threads=4 - Added Doxygen documentations and comments in the code - Added support for all data types supported Change-Id: Iffa1c09fda0bb9c61213bb83524d5a48e7ecb03c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/141281 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-872 - Rework NEGEMMConvolutionLayer to use NEGEMMGian Marco Iodice
Change-Id: I55f0018ac7214775ebbca63f58a3bf5c93732fec Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142632 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1276 - Allow GEMM to work with 3D input tensorGian Marco Iodice
Skipped im2col in CLGEMMConvolutionLayer for 1x1 convolutions with NHWC data layout Change-Id: I894e6b952ed8605e8f3ffc0ffc25c24730d4664c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/141909 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-1437: (Nightly) OCLGrind failures in CLDepthwiseConvolution QA8 nhwcGeorgios Pinitas
Change-Id: I2c1e69b4654e928d8e7e9071258194f258bb6935 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142368 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1359: (Nightly) CLCannyEdge failuresMichele Di Giorgio
Change-Id: I0fa02b8cc9289cfc4c89bea3f2041db938204948 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142232 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1124: Validate CLLSTMGeorgios Pinitas
-Enables cell-to-input weights when !cifg and peephole -Makes projection bias conditional Change-Id: Iee866db9f5d8479c2dfd95d74a2d42492bf07a8d Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/140543 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Les Bell <les.bell@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1434: Fix NEWinograd for NHWC and sub-tensorsGeorgios Pinitas
Apply offsets and strides to winograd transform functions in NEON. Change-Id: Ia4f44d22244203a5f9d93d2fed73570396b0d28c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/141803 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1431 Use either arm_dot or arm_dot_acc for CLGEMMLowp based on what ↵Giorgio Arena
is supported Change-Id: I4c5121e0f000d5ee94a8c8c5326272806f643e35 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/141520 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-1429: NEGEMMNativeWrapperKernel fails when ran for WinogradGeorgios Pinitas
Change-Id: Ief9b717fe2bcf626660109ec491f8882d0ef06d7 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/141658 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1367: Enable NHWC in graph examplesGeorgios Pinitas
Change-Id: Iabc54a3a1bdcd46a9a921cda39c7c85fef672b72 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/141449 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1316 Using 8 bit dot product instruction in CLDepthWiseConvolution ↵Giorgio Arena
with QASYMM8 Change-Id: I3fc37bdceaae8b4b1effa51129b71bf352388564 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/138374 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02COMPMID-1188: Remove FP16 capabilities check form NEReshapeLayer.Georgios Pinitas
Change-Id: I2dde22f70b5aa27be983cf6b6ee1d1926653aa99 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/141510 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1401 Implement NEFullyConnectedLayer for QASYMM8Giorgio Arena
Change-Id: I0404df6d369855e2f458f2db8f26e81c80a1ee87 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/140148 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1419: Make NEGEMMAssemblyDispatch dynamically typed instead of templatedAnthony Barbier
This makes it easier to integrate in GEMMLowpMatrixMultiplyCore Change-Id: Ibf80803f016a2e6a24d943ffafb50b48f04ec545 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/140868 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1405: Create our own gemm_native kernel / function.Anthony Barbier
Change-Id: Ie0a80bd6b4eb5632cac63ccf54bcb07d4309da19 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/140305 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>