aboutsummaryrefslogtreecommitdiff
path: root/src
AgeCommit message (Collapse)Author
2018-11-02COMPMID-1188 - Fix CLWinogradConvolutionLayer for NHWCGian Marco Iodice
Change-Id: Ib4abe0388f218276e79f7c4405827e61722f0ef8 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/144240 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1498 - Enable grouping in CLGEMMConvolutionLayerGian Marco Iodice
Change-Id: I15c7df21773145b03f42b6f78bd7ad2e5b8a5219 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/144126 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-1376: Add support for QASYMM8 in CLDeconvolutionLayerMichele Di Giorgio
Change-Id: I13ec79b6668e2b9559d3fa789ae0b51ab6975289 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/139126 Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1504: (Nightly) Segfaults on CL and androidGeorgios Pinitas
Keeps a copy of context in Scheduler to avoid releasing KernelLibrary resources before Scheduler resourses leading to a segfault. Does not exactly revert COMPMID-1122 as it still tries to keep context in sync. Change-Id: I3deb6bc1725b80f65f51ebd34d536f612ef6dd86 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/144024 Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1499: Fixed issues to build for FP16 on AndroidAnthony Barbier
Change-Id: I7cd15e9115b5c6f544005528d69061751286be11 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143708 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
2018-11-02COMPMID-1246 Remove unused window iterator from NERNNLayer.Michalis Spyrou
Change-Id: Ia1ab755f85adb602c115f20e384fb459d3f91927 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143894 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1188: Assign correct ticket to TODO in NEDerivativeKernelMichele Di Giorgio
Change-Id: I57bbfb79090fd57c57fdedd24a26736b272ea2f5 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143893 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1478: Fixed Doxygen comments + minor fixesAnthony Barbier
- Allow check_bad_style.sh to only run on some of the files - Pass missing lws_hint() in CLNormalizationLayerKernel Change-Id: I2cf44f82f7ba6c8dc8d40691aeec7c6c3de385b5 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143628 Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1473: Added missing TypePrinter for CPUModel, added accessor for ↵Anthony Barbier
number of CPUs Change-Id: If81d58b83143129bed91b9c6658b0cd4e623bc38 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143664 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
2018-11-02COMPMID-1485 - Add support for NHWC when running NEGEMMConvolutionLayer with ↵Gian Marco Iodice
FP16/QASYMM8 When the GEMM3D check fails, now we fallback to the classic implementation with im2col and col2im. In this manner the function can work with QASYMM8 and FP16 Change-Id: I359e9da3a63956f33b5acbc9bca4383b14af10e2 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143372 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1343: Add grouping support to CLCol2ImKernelMichele Di Giorgio
Change-Id: I5188a2163e7341f1915d98c21464fea13a9a7faf Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143330 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
2018-11-02COMPMID-1342 Add grouping support to CLIm2ColKernelGiorgio Arena
Change-Id: I4afb19751520a90fee27fb49b775cd10e92a94f5 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/140476 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1478: Stop relying on static default OpenCL objects in cl2.hppAnthony Barbier
This causes problems when ACL is used as a shared library on Android. Fixes some problems related to creation / destruction order between the Graph's CL backend and core / runtime Change-Id: I716d63fd42f4586df1ffbb6fa97e4db06d3a781b Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143228 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02COMPMID-1188 - Passed WIDTH_OFFSET at compile time in ↵Gian Marco Iodice
CLWidthDepthConcatenateLayerKernel Change-Id: Icab813cd432174608621ee6a87015aeb10ab822d Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143570 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1410 (Nightly) Mismatches in CLMeanStd function for floatMichalis Spyrou
Change CLReductionOperation border to be multiple of 64 instead of 16. The opencl kernel works only with local_size(0) being a power of 2. This will generate a padding of 63 if input_width % 64 = 1, but I don't think it's a big issue and it keeps the border calculation pretty simple. Also, increased tolerance for fp32 because there were mismatches for the 4K image. Change-Id: Id44990a262b2d6eff4c8ce56eb7c886274d9847e Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143415 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1486 - CLGEMMDilatedConvolutionLayer FP16 / FP32 failing in nightliesGian Marco Iodice
Wrong boundary condition in the im2col3x3_nhwc kernel Change-Id: I83e9dd9b425fd0e3227decb1da3d08a3f5e2536d Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143489 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02MLCE-13: Sanitizing matrix argument in the Warp.Pablo Tello
This changes help to prevent errors like passing a matrix with less elements than required into the warp functions. Change-Id: I863f933a5e0568258717cffed3a20788d3d03083 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143044 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1488 - Add support for NHWC when running CLGEMMConvolutionLayer with ↵Gian Marco Iodice
QASYMM8 Fixed also a bug in the graph API related to the bias shape in DepthWiseConvolution for NHWC Change-Id: I275141a42e51f6747b77db1c31d1bc69e8685af5 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143454 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1303: CLDepthConvert : Add support for FP32 -> FP16 and FP16 -> FP32 ↵Michele Di Giorgio
+ validate() function Change-Id: I6808de0254a7c4bca440322cc14b795b3b32465b Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142427 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1188 - Removed the multiplication by 4 in NEGEMMInterleavedWrapperGian Marco Iodice
Change-Id: Iaf8519bc483b947876a9b6ba83b4eb43b45b83a1 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143135 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1277 - Optimizing CLIm2ColKernel for NHWC.Gian Marco Iodice
This patch includes: - Im2Col optimizations for NHWC using a new data layout - Refactoring of CLIm2ColKernel adding validation method and auto-init - Removed im2col_reduced from CLIm2ColKernel and created a new kernel CLFlattenLayerKernel Change-Id: I1620640b6796baa268324b33ae92cdd8de53e27c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/141241 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
2018-11-02COMPMID-1344 Add grouping support to CLWeightsReshapeKernelGiorgio Arena
Change-Id: Idde333308db71087ec234b3fd1eb4e36a44db46c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143049 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1481: CLCannyEdge still failing in some precommitsMichele Di Giorgio
Without the check introduced by this patch, all weak edges as marked as strong edges. Change-Id: I874ebf22c06707bd98bd11b9be93602bfcbafa7c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142922 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
2018-11-02COMPMID-1188 - Fixed performance degradation with GEMM3DGian Marco Iodice
The previous implementation of GEMM3D degradated the performance when the input had to be reinterpreted as 3D. However if both input and output have to be reinterpreted as 3D, we can skip the offset calculation for that specific case and run the multi GEMM approach Change-Id: I0d5d48add2c6ccdebfbb268ea199dd181101f3aa Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142872 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1248 Enabled memory manager in NEWinogradConvolutionLayerAnthony Barbier
Change-Id: I7bbab53f18a42f0879d80122a52bb6bdca4b8631 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142413 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02COMPMID-1406: Refactor gemm_interleaved to use our own types and schedulerAnthony Barbier
- Ported PrepareB kernel from gemm_interleave - Ported TransformA feature from gemm_interleave - Allocate reshaped a and b buffers - Added memory_manager / memory_group - MatrixMultiply kernel - Interleave kernels execution. - Fixed a few bugs: all nightly Convolution tests passing for threads=1 and threads=4 - Added Doxygen documentations and comments in the code - Added support for all data types supported Change-Id: Iffa1c09fda0bb9c61213bb83524d5a48e7ecb03c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/141281 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-872 - Rework NEGEMMConvolutionLayer to use NEGEMMGian Marco Iodice
Change-Id: I55f0018ac7214775ebbca63f58a3bf5c93732fec Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142632 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1475: (OCLGrind) FP exception in NEGEMMConvolutionGeorgios Pinitas
Change-Id: I986099c269498cc7971b10ee634dba721954546e Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142647 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02MLCE-36: FC tranpose weightsGeorgios Pinitas
Change-Id: I3b8a6c00e61ba6da459ca5fc7275393f9d073aed Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142533 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02MLCE-37: Adds PermuteNode support in graphGeorgios Pinitas
Change-Id: Iaa93a497e7913c27f2fd09e974125cda5f04bc4b Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142463 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1188 - Fixed CLGEMMConvolutionLayer/NEGEMMConvolutionLayer for NHWCGian Marco Iodice
We skipped im2col also without unit strides Change-Id: I04c63a6dda8553b3890e832a56ff6854349c829a Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142520 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1439: Fix invalidation of iterator while iteratingGeorgios Pinitas
Change-Id: I0d253e6047216cfbd57dc807881c2b24d82c47f5 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142357 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1445: Fixed NEON GEMM assembly dispatch for QASYMM8Anthony Barbier
Change-Id: I2d20cd3c5f83a9ba4e0de6659b255337877d5bbc Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142252 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-1276 - Allow GEMM to work with 3D input tensorGian Marco Iodice
Skipped im2col in CLGEMMConvolutionLayer for 1x1 convolutions with NHWC data layout Change-Id: I894e6b952ed8605e8f3ffc0ffc25c24730d4664c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/141909 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-1437: (Nightly) OCLGrind failures in CLDepthwiseConvolution QA8 nhwcGeorgios Pinitas
Change-Id: I2c1e69b4654e928d8e7e9071258194f258bb6935 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142368 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1359: (Nightly) CLCannyEdge failuresMichele Di Giorgio
Change-Id: I0fa02b8cc9289cfc4c89bea3f2041db938204948 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142232 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1188: Expose FullyConnectedLayer info metadata at graph levelGeorgios Pinitas
Change-Id: I7670f79209a1e4439d57e05c1f5c576f600971cb Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142299 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1188 - Fixed get_convolution_method in CLConvolutionLayerGian Marco Iodice
Change-Id: I0c155d0d8a56fc6610dc2476e669456c7d2cc87b Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142068 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1124: Validate CLLSTMGeorgios Pinitas
-Enables cell-to-input weights when !cifg and peephole -Makes projection bias conditional Change-Id: Iee866db9f5d8479c2dfd95d74a2d42492bf07a8d Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/140543 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Les Bell <les.bell@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1290 - Update CLTuner documentationGian Marco Iodice
Change-Id: Ief1b6df40623c9f304093cf1f188c86454da3f9c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/141965 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1440: Access original B in gemm assembly when not pretransposed.Georgios Pinitas
Change-Id: I5f2c198f7ac4d8996180e204e763ab53f5e7ea3d Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142153 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Matteo Martincigh <matteo.martincigh@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1188: Add quantization info support in graph FC layer.Georgios Pinitas
Change-Id: Ie9a6a896da142198243139fb9f8be0f83b87ccce Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142130 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Vidhya Sudhan Loganathan <vidhyasudhan.loganathan@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1434: Fix NEWinograd for NHWC and sub-tensorsGeorgios Pinitas
Apply offsets and strides to winograd transform functions in NEON. Change-Id: Ia4f44d22244203a5f9d93d2fed73570396b0d28c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/141803 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1431 Use either arm_dot or arm_dot_acc for CLGEMMLowp based on what ↵Giorgio Arena
is supported Change-Id: I4c5121e0f000d5ee94a8c8c5326272806f643e35 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/141520 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-1429: NEGEMMNativeWrapperKernel fails when ran for WinogradGeorgios Pinitas
Change-Id: Ief9b717fe2bcf626660109ec491f8882d0ef06d7 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/141658 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1367: Enable NHWC in graph examplesGeorgios Pinitas
Change-Id: Iabc54a3a1bdcd46a9a921cda39c7c85fef672b72 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/141449 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1316 Using 8 bit dot product instruction in CLDepthWiseConvolution ↵Giorgio Arena
with QASYMM8 Change-Id: I3fc37bdceaae8b4b1effa51129b71bf352388564 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/138374 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02COMPMID-1188: Remove FP16 capabilities check form NEReshapeLayer.Georgios Pinitas
Change-Id: I2dde22f70b5aa27be983cf6b6ee1d1926653aa99 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/141510 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1386: Add FC convert weights on NEONGeorgios Pinitas
Change-Id: I7a3c6db9285e3899494f496b2562d80cec1b6521 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/141407 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1401 Implement NEFullyConnectedLayer for QASYMM8Giorgio Arena
Change-Id: I0404df6d369855e2f458f2db8f26e81c80a1ee87 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/140148 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>