Age | Commit message | Author
2018-11-02 | COMPMID-1505: Add native grouping support at graph level | Georgios Pinitas
Change-Id: Iedc91b0aee743b59af5140c8acb8124548da3163 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/144362 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
2018-11-02 | COMPMID-1509: (Nightly) CLDeconvolution fails for QASYMM8 | Michele Di Giorgio
Using same quantization info and input values range as for ConvolutionLayer. This needs further investigation to understand why there are mismatches when using the entire range. Change-Id: I8c20a341b29a1ac03c811d014911e7efc484c3a6 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/144340 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02 | COMPMID-1060 LSTM FP32 NEON | Michalis Spyrou
Change-Id: I0bdf874e61917903c26f713ec41a7ffc29e07233 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143892 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02 | COMPMID-1480 Add support for NHWC QASYMM8/FP32(non-optimized) to NEON DepthwiseConvolution | Giorgio Arena
Change-Id: I751f5d3fb74085d2e67f610ecf52da4736d0cfb5 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143870 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-1188: Fix subtensor check | Georgios Pinitas
Change-Id: Id8366a1d828e2f1a729c70bac1fb232182d59c0c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/144382 Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-1366 Implement NECopy | Michalis Spyrou
Change-Id: I183e4b7081bf12de3546293a00da68b4f4a0dd5e Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143987 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-1506 NPY Loader doesn't work for NHWC pipelines | Michalis Spyrou
Change-Id: I696fcded606e82a91526a9471f16fa2d1226ff4f Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/144144 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02 | COMPMID-1188 - Fix CLWinogradConvolutionLayer for NHWC | Gian Marco Iodice
Change-Id: Ib4abe0388f218276e79f7c4405827e61722f0ef8 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/144240 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-1188 - Enabled NHWC in graph_squeezenet_v1 for NEON | Gian Marco Iodice
Change-Id: Idb8eb689f0791ef7e33c416ff61b675651db3349 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/144223 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-1498 - Enable grouping in CLGEMMConvolutionLayer | Gian Marco Iodice
Change-Id: I15c7df21773145b03f42b6f78bd7ad2e5b8a5219 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/144126 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02 | COMPMID-1509: (Nightly) CLDeconvolution fails for QASYMM8 | Michele Di Giorgio
Increasing the absolute tolerance as values seem to differ by at most 2. Change-Id: I7f70f432760b64ee6c96a5fdeb34865c0f8f4796 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/144154 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-145 : Create ResNet v2 graph example | Georgios Pinitas
Change-Id: I6ff3d227321d8c3914f90ba4fc496b2fc122845c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/144070 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02 | COMPMID-1376: Add support for QASYMM8 in CLDeconvolutionLayer | Michele Di Giorgio
Change-Id: I13ec79b6668e2b9559d3fa789ae0b51ab6975289 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/139126 Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-1504: (Nightly) Segfaults on CL and android | Georgios Pinitas
Keeps a copy of the context in Scheduler so that KernelLibrary resources are not released before Scheduler resources, which led to a segfault. Does not exactly revert COMPMID-1122, as it still tries to keep the context in sync. Change-Id: I3deb6bc1725b80f65f51ebd34d536f612ef6dd86 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/144024 Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-1456: Create mobilenet v2 1.0 224 graph example | Georgios Pinitas
Change-Id: I26533af88aebe4bd9692ee1cdcd24eca34acea32 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143984 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02 | COMPMID-1499: Fixed issues to build for FP16 on Android | Anthony Barbier
Change-Id: I7cd15e9115b5c6f544005528d69061751286be11 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143708 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
2018-11-02 | COMPMID-1246 Remove unused window iterator from NERNNLayer. | Michalis Spyrou
Change-Id: Ia1ab755f85adb602c115f20e384fb459d3f91927 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143894 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-1188: Assign correct ticket to TODO in NEDerivativeKernel | Michele Di Giorgio
Change-Id: I57bbfb79090fd57c57fdedd24a26736b272ea2f5 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143893 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-1500: (Nightly) CLIm2ColGrouped std::bad_alloc and crashes | Georgios Pinitas
Decrease the large sizes, as they lead to std::bad_alloc for some shapes. Change-Id: I274ceb65411c0ddef87f11135d7fdddfc89c7651 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143877 Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-1188 Remove some FixedPoint leftovers from tests | Giorgio Arena
Change-Id: I9e9b267ea58fd2339467af6f49ae76e9195cbc61 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143682 Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-1478: Fixed Doxygen comments + minor fixes | Anthony Barbier
- Allow check_bad_style.sh to only run on some of the files
- Pass missing lws_hint() in CLNormalizationLayerKernel
Change-Id: I2cf44f82f7ba6c8dc8d40691aeec7c6c3de385b5 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143628 Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-1473: Added missing TypePrinter for CPUModel, added accessor for number of CPUs | Anthony Barbier
Change-Id: If81d58b83143129bed91b9c6658b0cd4e623bc38 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143664 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
2018-11-02 | COMPMID-1485 - Add support for NHWC when running NEGEMMConvolutionLayer with FP16/QASYMM8 | Gian Marco Iodice
When the GEMM3D check fails, we now fall back to the classic implementation with im2col and col2im. This way the function can work with QASYMM8 and FP16. Change-Id: I359e9da3a63956f33b5acbc9bca4383b14af10e2 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143372 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-1343: Add grouping support to CLCol2ImKernel | Michele Di Giorgio
Change-Id: I5188a2163e7341f1915d98c21464fea13a9a7faf Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143330 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
2018-11-02 | COMPMID-1342 Add grouping support to CLIm2ColKernel | Giorgio Arena
Change-Id: I4afb19751520a90fee27fb49b775cd10e92a94f5 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/140476 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-1478: Stop relying on static default OpenCL objects in cl2.hpp | Anthony Barbier
This causes problems when ACL is used as a shared library on Android. Fixes some problems related to creation / destruction order between the Graph's CL backend and core / runtime Change-Id: I716d63fd42f4586df1ffbb6fa97e4db06d3a781b Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143228 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02 | COMPMID-1188 - Passed WIDTH_OFFSET at compile time in CLWidthDepthConcatenateLayerKernel | Gian Marco Iodice
Change-Id: Icab813cd432174608621ee6a87015aeb10ab822d Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143570 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-1496 (Nightly) Mismatches in CLMeanStd function for FP16 | Michalis Spyrou
Changed RelativeTolerance to Absolute for F16/F32, as the values can be very close to zero for large inputs. Change-Id: Ibeab9f4e4d218e4ceaad00b1725acc34e80c7afb Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143576 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
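The reasoning in this commit message can be illustrated with a small sketch. It is not the library's validation code: the helper names `within_relative` and `within_absolute` and the numeric values are assumptions chosen only to show why a relative check becomes unreliable when the expected value is close to zero.

```python
def within_relative(reference: float, target: float, tolerance: float) -> bool:
    """True when |target - reference| is at most `tolerance` times |reference|."""
    if reference == 0.0:
        return target == 0.0
    return abs(target - reference) <= tolerance * abs(reference)


def within_absolute(reference: float, target: float, tolerance: float) -> bool:
    """True when |target - reference| is at most `tolerance`."""
    return abs(target - reference) <= tolerance


# Illustrative values only: two results that are both practically zero.
reference, target = 1e-7, 3e-7

# The difference is 2e-7, which is 200% of the reference but tiny in absolute terms.
relative_ok = within_relative(reference, target, 0.05)
absolute_ok = within_absolute(reference, target, 1e-3)
print(relative_ok, absolute_ok)
```

With these values the relative check rejects a difference that the absolute check accepts, which appears to be the situation the commit describes.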
2018-11-02 | COMPMID-1410 (Nightly) Mismatches in CLMeanStd function for float | Michalis Spyrou
Change the CLReductionOperation border to be a multiple of 64 instead of 16. The OpenCL kernel works only with local_size(0) being a power of 2. This will generate a padding of 63 if input_width % 64 = 1, but I don't think it's a big issue and it keeps the border calculation pretty simple. Also increased the tolerance for FP32 because there were mismatches for the 4K image. Change-Id: Id44990a262b2d6eff4c8ce56eb7c886274d9847e Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143415 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
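The padding arithmetic mentioned in this commit message can be checked with a short sketch. The function names `right_border` and `is_power_of_two` are hypothetical and the code only reproduces the rounding described in the message; the actual border computation in CLReductionOperation is not shown in this log.

```python
def right_border(input_width: int, multiple: int = 64) -> int:
    """Padding needed to round `input_width` up to the next multiple of `multiple`."""
    return (multiple - input_width % multiple) % multiple


def is_power_of_two(n: int) -> bool:
    """True for 1, 2, 4, 8, ... (the constraint the message states for local_size(0))."""
    return n > 0 and (n & (n - 1)) == 0


# Print the width, the padding, and the padded width for a few sample widths.
for width in (64, 65, 100, 128):
    print(width, right_border(width), width + right_border(width))
```

A width of 65 satisfies input_width % 64 = 1 and needs 63 elements of padding, the worst case named in the message; a width that is already a multiple of 64 needs none.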
2018-11-02 | COMPMID-1486 - CLGEMMDilatedConvolutionLayer FP16 / FP32 failing in nightlies | Gian Marco Iodice
Wrong boundary condition in the im2col3x3_nhwc kernel Change-Id: I83e9dd9b425fd0e3227decb1da3d08a3f5e2536d Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143489 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | MLCE-13: Sanitizing matrix argument in the Warp. | Pablo Tello
This change helps prevent errors such as passing a matrix with fewer elements than required into the warp functions. Change-Id: I863f933a5e0568258717cffed3a20788d3d03083 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143044 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-1416 Fix Arithmetic Addition Reference | Giorgio Arena
Removing support for uint8_t (QASYMM8) in the reference function that accepts dst_data_type should be enough. Change-Id: I46a43facf25463a8cbd3c5d5820c2cc06259ff10 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143399 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-1488 - Add support for NHWC when running CLGEMMConvolutionLayer with QASYMM8 | Gian Marco Iodice
Also fixed a bug in the graph API related to the bias shape in DepthWiseConvolution for NHWC. Change-Id: I275141a42e51f6747b77db1c31d1bc69e8685af5 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143454 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-1487 - CLIm2Col QASYMM8 failing in nightlies | Gian Marco Iodice
The flag "ChannelsFirstOutputNHWC" was not set Change-Id: Id5f64a839d4e86638a07090e971a4f7ee82af349 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143457 Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-1478: Updated OpenCL headers to the latest Khronos ones | Anthony Barbier
Change-Id: Ie26b78c9da635206c96111ea490ac565063838ba Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143408 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02 | COMPMID-1479: Fixed non fortran_order NPY loading. | Anthony Barbier
- Reverse dimensions when loading a non-fortran order tensor
- Support saving tensors with arbitrary number of dimensions (Not just 2)
- Fixed a minor bug in SONAME generation
Change-Id: I36aa0b05c9d3568d1296da2d84d5e299b40459cc Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142794 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
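The first item of this commit can be sketched as follows. An NPY header stores a `shape` tuple and a `fortran_order` flag; in C order (fortran_order false) the last index varies fastest, while in Fortran order the first does. The sketch assumes the destination tensor lists its fastest-varying dimension first, which is an assumption not stated in this log, and `to_innermost_first` is a hypothetical helper rather than the loader's code.

```python
def to_innermost_first(npy_shape, fortran_order):
    """Return the dimensions ordered with the fastest-varying one first."""
    dims = list(npy_shape)
    # Fortran order already lists the fastest-varying dimension first;
    # C order lists it last, so the dimensions have to be reversed.
    return dims if fortran_order else dims[::-1]


# A C-order array saved with shape (2, 3, 4): the dimension of size 4 is contiguous.
c_order_dims = to_innermost_first((2, 3, 4), fortran_order=False)

# The same header shape flagged as Fortran order is kept unchanged.
f_order_dims = to_innermost_first((2, 3, 4), fortran_order=True)

print(c_order_dims, f_order_dims)
```

Because the helper works on a shape of any length, the same sketch also covers tensors with more than two dimensions.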
2018-11-02 | COMPMID-1303: CLDepthConvert : Add support for FP32 -> FP16 and FP16 -> FP32 + validate() function | Michele Di Giorgio
Change-Id: I6808de0254a7c4bca440322cc14b795b3b32465b Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142427 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-1188 Fix WeightsReshape doc | Giorgio Arena
Change-Id: If15e06ad3aa092d32c4d88172a9fea79a7416b2b Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143128 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-1188 - Removed the multiplication by 4 in NEGEMMInterleavedWrapper | Gian Marco Iodice
Change-Id: Iaf8519bc483b947876a9b6ba83b4eb43b45b83a1 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143135 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-1243 Update doxygen to use Android ndk r17b | Anthony Barbier
Change-Id: Iea248dca88828669b680aeacbbf2b359d2bed304 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143143 Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-1277 - Optimizing CLIm2ColKernel for NHWC. | Gian Marco Iodice
This patch includes:
- Im2Col optimizations for NHWC using a new data layout
- Refactoring of CLIm2ColKernel adding validation method and auto-init
- Removed im2col_reduced from CLIm2ColKernel and created a new kernel CLFlattenLayerKernel
Change-Id: I1620640b6796baa268324b33ae92cdd8de53e27c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/141241 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
2018-11-02 | COMPMID-1188 - Fixed file extensions in GraphUtils.h | Gian Marco Iodice
Also fixed the calculation of num_elements in access_numpy_tensor Change-Id: Ic1a394ff829746d7803b81360830bade63b6b82a Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143132 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-1344 Add grouping support to CLWeightsReshapeKernel | Giorgio Arena
Change-Id: Idde333308db71087ec234b3fd1eb4e36a44db46c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143049 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-1481: CLCannyEdge still failing in some precommits | Michele Di Giorgio
Without the check introduced by this patch, all weak edges are marked as strong edges. Change-Id: I874ebf22c06707bd98bd11b9be93602bfcbafa7c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142922 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
2018-11-02 | COMPMID-1188 - Fixed performance degradation with GEMM3D | Gian Marco Iodice
The previous implementation of GEMM3D degraded performance when the input had to be reinterpreted as 3D. However, if both input and output have to be reinterpreted as 3D, we can skip the offset calculation for that specific case and run the multi GEMM approach. Change-Id: I0d5d48add2c6ccdebfbb268ea199dd181101f3aa Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142872 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-1248 Enabled memory manager in NEWinogradConvolutionLayer | Anthony Barbier
Change-Id: I7bbab53f18a42f0879d80122a52bb6bdca4b8631 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142413 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02 | COMPMID-1406: Refactor gemm_interleaved to use our own types and scheduler | Anthony Barbier
- Ported PrepareB kernel from gemm_interleave
- Ported TransformA feature from gemm_interleave
- Allocate reshaped a and b buffers
- Added memory_manager / memory_group
- MatrixMultiply kernel
- Interleave kernels execution.
- Fixed a few bugs: all nightly Convolution tests passing for threads=1 and threads=4
- Added Doxygen documentation and comments in the code
- Added support for all supported data types
Change-Id: Iffa1c09fda0bb9c61213bb83524d5a48e7ecb03c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/141281 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02 | COMPMID-872 - Rework NEGEMMConvolutionLayer to use NEGEMM | Gian Marco Iodice
Change-Id: I55f0018ac7214775ebbca63f58a3bf5c93732fec Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142632 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-1475: (OCLGrind) FP exception in NEGEMMConvolution | Georgios Pinitas
Change-Id: I986099c269498cc7971b10ee634dba721954546e Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142647 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02 | MLCE-36: FC transpose weights | Georgios Pinitas
Change-Id: I3b8a6c00e61ba6da459ca5fc7275393f9d073aed Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142533 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>