Age | Commit message | Author
2018-11-02 | COMPMID-893: Fix segfault in graph_squeezenet_v1_1 | Georgios Pinitas
The number of threads in the scheduler was set after the example was configured, leading to memory corruption when more threads were used. Change-Id: I221a026196cd64f1805d31596f0488a247a4bfab Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119196 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Isabella Gottardi <isabella.gottardi@arm.com>
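A minimal sketch of how this kind of ordering bug can arise, assuming per-thread scratch storage is sized from the thread count known at configuration time. `ToyScheduler` and `ToyFunction` are hypothetical stand-ins for illustration, not the library's classes, and the real cause in the example may differ in detail.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a scheduler that only stores a thread count.
struct ToyScheduler
{
    std::size_t num_threads = 1;
};

// Hypothetical function object that sizes per-thread scratch at configure time.
class ToyFunction
{
public:
    void configure(const ToyScheduler &scheduler)
    {
        // One scratch slot per thread known at configuration time.
        _scratch.assign(scheduler.num_threads, 0.0f);
    }

    // True if every worker thread of the scheduler has its own scratch slot.
    bool scratch_is_sufficient(const ToyScheduler &scheduler) const
    {
        return _scratch.size() >= scheduler.num_threads;
    }

private:
    std::vector<float> _scratch;
};

// Buggy order: configure first, then raise the thread count.
inline bool configure_then_set_threads(std::size_t threads)
{
    ToyScheduler scheduler;
    ToyFunction  function;
    function.configure(scheduler);
    scheduler.num_threads = threads;
    return function.scratch_is_sufficient(scheduler);
}

// Fixed order: set the thread count first, then configure.
inline bool set_threads_then_configure(std::size_t threads)
{
    ToyScheduler scheduler;
    ToyFunction  function;
    scheduler.num_threads = threads;
    function.configure(scheduler);
    return function.scratch_is_sufficient(scheduler);
}
```

With four threads, `configure_then_set_threads(4)` reports insufficient scratch (workers would index past the end), while `set_threads_then_configure(4)` does not.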
2018-11-02 | COMPMID-895 - Optimizing CLDepthwiseConvolution3x3Kernel | Gian Marco
This patch brings MAC utilisation up to 25% when both stride_x and stride_y are equal to 1. Performance is reported on the following Confluence page: https://confluence.arm.com/display/MLENG/Depthwise+convolution+3x3+FP32+performance%3A+ACL+18.02 Change-Id: Ida1b64be9a88805902a3d90194559b58eb1224a3 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119068 Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-894: Segfault: neon_cnn on S5 neo | Georgios Pinitas
Removed double management of the same tensor object. Change-Id: Ibc74cd8c7bd199cd473ff68f692840cbf01b27b3 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119119 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Isabella Gottardi <isabella.gottardi@arm.com>
2018-11-02 | COMPMID-765 - Added LWS hint in CLIm2Col | Gian Marco
The LWS hint has been applied for optimized cases 1x1 and 3x3 Change-Id: I6b4bfe2f9f7da627052336889b8a18d279fe2675 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119162 Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-896: Replace legacy 4x4 u8 GEMM kernel with safe version. | David Mansell
It's not safe to accumulate two u8xu8 results into a u16 accumulator. This changes the kernel to use uadalp after every single multiply. Correct the test fixture as well. Change-Id: I011b90033c4673e55b843d079e3f7d185b1df330 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119096 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
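The overflow is easy to see in scalar form: a single u8 x u8 product can reach 255 * 255 = 65025, so the sum of two products (up to 130050) no longer fits in 16 bits. The sketch below is a scalar illustration only, not the NEON kernel; the function names are hypothetical. The second function widens every product into a 32-bit accumulator, which is the effect the commit obtains by issuing uadalp (pairwise add and accumulate into wider lanes) after each multiply.

```cpp
#include <cstddef>
#include <cstdint>

// Unsafe: adds two u8*u8 products in 16 bits before widening, so the
// partial sum can wrap around (maximum 130050 > 65535).
inline std::uint32_t dot_u8_pairs_in_u16(const std::uint8_t *a, const std::uint8_t *b, std::size_t n)
{
    std::uint32_t acc = 0;
    std::size_t   i   = 0;
    for(; i + 1 < n; i += 2)
    {
        const std::uint16_t p0      = static_cast<std::uint16_t>(a[i] * b[i]);
        const std::uint16_t p1      = static_cast<std::uint16_t>(a[i + 1] * b[i + 1]);
        const std::uint16_t partial = static_cast<std::uint16_t>(p0 + p1); // may wrap
        acc += partial;
    }
    if(i < n) // odd tail element
    {
        acc += static_cast<std::uint32_t>(a[i]) * b[i];
    }
    return acc;
}

// Safe: every single product is widened into the 32-bit accumulator.
inline std::uint32_t dot_u8_widen_each(const std::uint8_t *a, const std::uint8_t *b, std::size_t n)
{
    std::uint32_t acc = 0;
    for(std::size_t i = 0; i < n; ++i)
    {
        acc += static_cast<std::uint32_t>(a[i]) * b[i];
    }
    return acc;
}
```

For two elements that are all 255, the safe version returns 130050 while the 16-bit pairing returns 64514 (130050 modulo 65536).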
2018-11-02 | COMPMID-784: Added support for PADDING = SAME in Winograd layer. | Pablo Tello
Change-Id: I5a420da6a8041f9ff6d0811815f2fc74c85c56a8 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119014 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-891 - Use OpenCL timer in CLTuner | Gian Marco
Change-Id: I84a914c13b162c4f74321c9cafc30a18ad4ebbdb Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118797 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02 | COMPMID-855 - Optimizing im2col on OpenCL (DCHW) | Gian Marco
Introduced optimizations for 1x1, 3x3, 5x5 and 11x11 Change-Id: Ibb7f7a9fbec01a7684746ed8513634078126e452 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118107 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2018-11-02 | COMPMID-754: Add validation method to CLPermute kernel | Michele Di Giorgio
Change-Id: If6f3888a035b557a6c369efa22b56d6c8d3efbd3 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118789 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2018-11-02 | COMPMID-765 - Used GEMM instead of Direct in the GoogleNet graph example | Gian Marco
Change-Id: Ic639d51fb5dd4f78912a9b11abc7df79d205a22b Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118843 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02 | COMPMID-889 - Implement Squeezenet 1.1 as a graph example | Isabella Gottardi
Change-Id: I12d4af007c123b19925ceb5e3c84285e096bc13b Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118718 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02 | COMPMID-874: Improve default number of threads choice in the Scheduler | Georgios Pinitas
Change-Id: Ia30ec2afce0aafcd39f41440efb972b18bbda9f8 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118657 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02 | COMPMID-890: Valgrind: NEON Convolution Layer validation fails | Isabella Gottardi
Change-Id: I5296815cf04e5f805d6523196567b6c01715c8b5 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118711 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-871: Remove vst4q/vld4q from NEActivationLayer. | Georgios Pinitas
Change-Id: Iebd2a8fece1af87c93d6795e176d8c37ca64bbf6 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118187 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-787: Add CL support for broadcast multiply | Michele Di Giorgio
Change-Id: I71f67789648ef05ccdedce77c7427bc0127b3a69 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116741 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-765: Fix direct convolution output stage. | Georgios Pinitas
Change-Id: Ie4ac7f61675c1fb9b1748d6784fccb26f058832a Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118635 Reviewed-by: Robert Hughes <robert.hughes@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-765: Fixed clangtidy warnings. | Pablo Tello
Change-Id: I83d0f2bc8e0ebfdc0b60931f2c5acf0469caf886 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118696 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-765: Fixed CPU detection code to read larger buffer (was failing on Arm Cortex-A55 FPGA with 8 CPUs with lots of flags). | Pablo Tello
Change-Id: I493fb1013c6c25d9b9c809705b1ee24abac1d8d1 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118456 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-784: Winograd transforms refactoring | Pablo Tello
1) Removed the example files winograd_layer.hpp/cpp 2) Templatized winograd transform kernels Change-Id: I7045fa0b801b9d30a11275914aaa2dafd254aed2 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118332 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-765 Small fixes to MobileNet dwc dataset | Giorgio Arena
Change-Id: I14ff5e2964328d22c0bba5a77683e07f0c7920e9 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118389 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | APPBROWSER-400: Implement the tensorshift kernel for fixing DC's alignment issue on OpenGL ES | Xinghang Zhou
Change-Id: I7a8489bb0fddc72899ea165e414ee87bdbfb45b3 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118106 Reviewed-by: Joel Liang <joel.liang@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-832: Decrease validation coverage of QS8/QS16 | Anthony Barbier
Change-Id: I5366d11aefdb8f3ba7326ed7527eb216c4de0668 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118372 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | APPBROWSER-394: GLES fails to compile Dropout kernel on S5 | steli01
Change-Id: Ie480332e6e302edd406627e90be0d7df3e61dde5 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118303 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-765: Added tolerance of 1 for CL Convolution S16 while the issue is investigated | Anthony Barbier
Change-Id: I5a69198bfd60d9cdd061f2db9838d9f0df9ecc23 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118454 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | IVGCVSW-863 Broadcast support in CL/NEON Arithmetic Add | Diego Lopez Recas
Also, added instrumentation to support generic tensor broadcasting for NEON and CL backends. Change-Id: I1bc5747a286e1a4b464c209067581e103d473b9a Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/114201 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-866: Integrate SGEMV Neon Assembly from RSH | Michele Di Giorgio
Change-Id: Icbb43de7642e2b433d7471d70b9dbbde850989d3 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118197 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02 | COMPMID-765: Allow RSH's code to not have default cases in their switches | Anthony Barbier
Change-Id: I2d3cc9668852a1ba414fc3148866df408f770dc8 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118308 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | APPBROWSER-390,397,398: bugfix and fully connected validation issue on specific dataset | zhenglin
Change-Id: I227e90445715c3bd394e49930b010c0a5f5ca177 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118108 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Joel Liang <joel.liang@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-861: updated RSH Gemm's transforms. | Pablo Tello
Change-Id: Ic1f215c1ae85ad5c516cc3600447a50bba77ebc1 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117668 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-815: Fixed Winograd 5x5 padding bug. | Pablo Tello
Change-Id: I38ae204632ae27c5fe7a0131462343397899868c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118120 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-833 Direct convolution, Normalization and Fully Connected test names are not unique | Michalis Spyrou
Change-Id: Ie4654cc1cb4720c51a3114162043562d5cbc6d28 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118126 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-564: CustomConvolution Test Name updated | Sanghoon Lee
Change-Id: I880ac3a1c3f5ea09ccefe27d9ee40bd60afcea2b Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118056 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-765 - Added third dimension for CLTuner | Gian Marco
Change-Id: I0a7ea4cde1dbf8edd28908dfff80928ef7e996c4 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117647 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-588: Port Equalize Histogram to new validation | John Richardson
Change-Id: Iff50adf2993bd69c2696a47559d6b2e0011fed87 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/110177 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-765 Fixed missing cast that was breaking the bare metal build | Anthony Barbier
Change-Id: I80437f7ba6e4b8ec1fb145300a017b3688f3f2b6 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118086 Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-837: Fixed remap tests failures in Valgrind. | Pablo Tello
Some minor improvements in the test fixture, for example making sure the values in the mapx and mapy tensors are in the range of [-5, in_width+5] and [-5,in_height]. Tolerance was changed to 0, no mismatches expected. Change-Id: I2fad06defb293bf9fdd1988799b19547c102dee5 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118044 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | APPBROWSER-395: Random error in FullyConnectedLayer | steli01
Change-Id: Ic460695b8a203c1080ea177b5463b48b07b70c4b Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118075 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Joel Liang <joel.liang@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | IVGCVSW-798 Add Softmax NEON support for QASYMM8 | Diego Lopez Recas
Change-Id: I4f2cca52caf210fdb7d6bb7e9436ac51cb5088b4 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/112398 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-564: CustomConvolution issue fixed | Sanghoon Lee
Change-Id: Ia2874d30780cb597a6e5039120815f2368911e0c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118024 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-784: Added support for biases in WinogradLayer. | Pablo Tello
1) Updated to the latest code from the RSH repo. 2) Moved winograd transforms into kernels. 3) Added support for biases Change-Id: I7f39f34a599b49d7d9b549cc10a4f4d4a8007ab8 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117474 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-791: Generic Depthwise Convolution Layer NEON QASYMM8 | Georgios Pinitas
Change-Id: I33cf54e68f6c097ac58b6f16c3f9a720978f09cd Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117289 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-863: Only output (end-start) for OpenCL timers | Anthony Barbier
Currently we output an array of timestamps: queued, submitted, start, end. This patch instead outputs only end-start (i.e. the time it took to execute the kernel on the GPU). Change-Id: Ic3c2b68128f6acd6bb018b7b3ead0b69dd5aca59 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117865 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Kevin Petit <kevin.petit@arm.com>
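As a rough illustration of the reported quantity: OpenCL exposes the four profiling timestamps of an event through clGetEventProfilingInfo (CL_PROFILING_COMMAND_QUEUED, _SUBMIT, _START and _END), each in nanoseconds. The sketch below assumes those values have already been read into a plain struct; `KernelTimestamps` and `execution_time_ns` are hypothetical names, not the benchmark framework's own types.

```cpp
#include <cstdint>

// Hypothetical record of the four profiling timestamps of one kernel,
// in nanoseconds, as an OpenCL profiling query would report them.
struct KernelTimestamps
{
    std::uint64_t queued;
    std::uint64_t submitted;
    std::uint64_t start;
    std::uint64_t end;
};

// Time the kernel spent executing on the device: end minus start.
// Queueing and submission latency are deliberately left out.
inline std::uint64_t execution_time_ns(const KernelTimestamps &t)
{
    return t.end - t.start;
}
```

For timestamps {100, 150, 400, 1000} the reported value is 600 ns, regardless of how long the command waited in the queue.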
2018-11-02 | COMPMID-790 - NEON: Add QASYMM8 support to Convolution | Isabella Gottardi
Change-Id: Iec82a91ad351cfe8d07d0976a24bd42f4703177a Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116833 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02 | COMPMID-765: Clangtidy warnings | Pablo Tello
Change-Id: If8c1e0103ae2e3dfde3d0b9f23575c0e904c7f30 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117961 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-863: Remove some of the post-processing from the JSON backend | Anthony Barbier
Refactored the console printer too (So that we can re-use the code if needed) Change-Id: I16a0f70104f82f07cd59900b383038fa5a76e1bc Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117858 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02 | COMPMID-834 Fix arm_compute_nightly_validation getting killed | Michalis Spyrou
Changed CLReductionOperationKernel: each kernel now computes a 2D slice instead of a 1D one. This reduces the memory footprint, which was caused by the __local memory, from around 1.6 GB for a 4k input image to a few MB; the large footprint was probably the cause of this bug. Change-Id: I71ac71ff09b041c945a134177600f0f3475e48cf Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117835 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-848 NEPoolingLayerKernel incorrectly reports it supports asymmetric padding | Michalis Spyrou
Add asymmetric padding support for NEPoolingLayer. Change-Id: Ia5cc660aeca636c3c45df4916a28974cc2b7f2f4 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117275 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-748 - Integrating optimized SGEMM for bifrost | Gian Marco
This patch introduces a new GEMM that improves MAC utilisation by 10% compared to the GEMM without reshape. However, this implementation is not faster in all cases, as the time for reshaping the matrices must be taken into account. For this reason, a heuristic to select the optimal GEMM has been added to the function. More information about the heuristic implementation can be found at COMPMID-852. With this patch, GoogleNet, MobileNet, VGG16 and SqueezeNet can improve performance by 1.5x. More information about the performance uplift can be found here: https://confluence.arm.com/display/MLENG/GEMM+FP32+performance%3A+ACL+18.02 Change-Id: I024563c06b9aed02a211a974e452bae5c233b04c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117140 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
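The trade-off described here can be sketched with a toy cost model. This is not the heuristic from COMPMID-852, whose details are not given in this log; the struct names, the cost constants and the decision rule below are all assumptions chosen only to show why a reshaped GEMM pays off for large problems but not for small or skinny ones.

```cpp
#include <cstddef>

// Problem size of C(m x n) = A(m x k) * B(k x n).
struct GemmShape
{
    std::size_t m;
    std::size_t n;
    std::size_t k;
};

// Hypothetical cost constants, in arbitrary time units.
struct CostModel
{
    double mac_cost;         // cost of one multiply-accumulate without reshape
    double reshape_cost;     // cost of moving one matrix element during reshape
    double reshaped_speedup; // assumed compute speedup of the reshaped kernel
};

// Returns true when the reshaped GEMM is estimated to be faster once the
// time spent reshaping A and B is included.
inline bool prefer_reshaped_gemm(const GemmShape &s, const CostModel &c)
{
    const double macs           = static_cast<double>(s.m) * s.n * s.k;
    const double reshaped_elems = static_cast<double>(s.m) * s.k + static_cast<double>(s.k) * s.n;
    const double plain_time     = macs * c.mac_cost;
    const double reshaped_time  = macs * c.mac_cost / c.reshaped_speedup + reshaped_elems * c.reshape_cost;
    return reshaped_time < plain_time;
}
```

With an assumed speedup of 1.1 and unit costs, a 1000 x 1000 x 1000 problem favours the reshaped path, whereas a 1 x 1000 x 1000 problem does not, because the reshape cost dominates the small amount of compute.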
2018-11-02 | COMPMID-765: Fixed output accessor in LeNet example, and disabled colors when not running in a terminal | Anthony Barbier
Change-Id: I4ec90803c5dc41b0cee05c36113ae3f189564d58 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117831 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-564: Implement reference and CL/NEON validation for CustomConvolution (output S16) | Sanghoon Lee
Change-Id: Ic099336f558e994210a59e14ec0171fae68ccb80 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116663 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>