path: root/src
Age | Commit message | Author
2018-11-02 | COMPMID-1400: Add command line option to specify the tuner's config file | Anthony Barbier
Change-Id: Ib597e0dff4c8c01f7e6bd46d03824beef4bc1e9a Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/139923 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02 | COMPMID-807: NHWC support in CLDirectConvolution. | Pablo Tello
Change-Id: I8738aca2cc0104e4c4d7c9605762ab59fce10a33 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/137333 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-1271 Avoid memory leak in list of gemm methods | Anthony Barbier
Change-Id: I80764d09bf5fb87b3a98bc0e1803d25c6c682c1f Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/139859 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-1288 Optimizing CLGEMMLowp using 8 bit dot product instruction | Giorgio Arena
Change-Id: I536174b9381660a94578d6aa1892a6289a820391 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/139109 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-1373 Enable tests for CLWinogradConvolutionLayer for NHWC | Giorgio Arena
Change-Id: I2c6a744f174cfb6c78a9923b737f06537debaa0d Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/139758 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-1399: In ITensor::copy_from, line_size counts num_channels twice | Anthony Barbier
element_size = data_type_size * num_channels
line_size = element_size * dimension[0] * num_channels
Change-Id: I72dd2ed7ee1f461f86232955c11e002a706cc89a Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/139833 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
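The double counting described in this commit message can be illustrated with a small sketch. The function names below are hypothetical and are not the library's API; the corrected form is an assumption inferred from the two formulas quoted above, not a copy of the actual patch.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helpers for illustration only; not the library's API.

// Size in bytes of one element: all channels of one pixel.
inline std::size_t element_size(std::size_t data_type_size, std::size_t num_channels)
{
    return data_type_size * num_channels;
}

// Form quoted in the commit message: num_channels is multiplied in twice,
// once inside element_size and once more here.
inline std::size_t line_size_buggy(std::size_t data_type_size, std::size_t num_channels, std::size_t width)
{
    return element_size(data_type_size, num_channels) * width * num_channels;
}

// Presumed correction (an assumption): element_size already accounts for
// the channels, so a line is just element_size times the width.
inline std::size_t line_size_fixed(std::size_t data_type_size, std::size_t num_channels, std::size_t width)
{
    return element_size(data_type_size, num_channels) * width;
}

// The two forms agree only when there is a single channel.
inline void check_single_channel_agreement()
{
    assert(line_size_buggy(4, 1, 8) == line_size_fixed(4, 1, 8));
}
```

For a 1-byte data type with 3 channels and a width of 10, the quoted form yields 90 bytes where 30 are expected, which is why the problem would only show up on multi-channel tensors.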
2018-11-02 | COMPMID-1381: Fix nightly build failure on armv7 | Anthony Barbier
Change-Id: Ic8e238a03361c04cc0b9daada81d63e5e423d2b8 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/139825 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-1390: OCLGrind and benchmark tests fail for QASYMM8 | Georgios Pinitas
COMPMID-1392: OCLGrind failures in im2col1x1_stridex1_dchw
COMPMID-1395: OCLGrind failures in output_stage_quantized
Change-Id: I35504bd1f701316df122be52d458c71bbd7e7909 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/139722 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-1381: Cleaned up the AssemblyHelper interface | Anthony Barbier
Introduced a new IFunction for when we'll fork the arm_gemm functions.
Increased encapsulation and abstraction of which method is used.
Change-Id: I5fd8b14b5c77e7f8ecb09029b5e2eccd10dbdcf4 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/139108 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02 | COMPMID-1226 Extend CLMeanStdDev to support FP32 / FP16 | Michalis Spyrou
- Extend support for FP16 in CLReduction.
- For F16/F32 MeanStdDev, we perform one reduction operation for the mean and one for the stddev, and we calculate the final result on the host CPU.
Change-Id: Iad2099f26c0ba7969737d22f00c6c275634d875c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/135870 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
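One way to read "one reduction for mean, one for stddev, final result on the host" is sketched below. This is an assumption, not the kernel's actual code: it supposes the two reductions are a sum and a sum of squares, and it uses plain CPU loops as stand-ins for the device reductions. All names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <utility>
#include <vector>

// Stand-in for the first device reduction (assumed to be a plain sum).
inline float reduce_sum(const std::vector<float> &values)
{
    float acc = 0.0f;
    for(float v : values)
    {
        acc += v;
    }
    return acc;
}

// Stand-in for the second device reduction (assumed to be a sum of squares).
inline float reduce_sum_squares(const std::vector<float> &values)
{
    float acc = 0.0f;
    for(float v : values)
    {
        acc += v * v;
    }
    return acc;
}

// Host-side final step: combine the two reduced scalars into
// {mean, population standard deviation}.
inline std::pair<float, float> mean_stddev(const std::vector<float> &values)
{
    assert(!values.empty());
    const float n        = static_cast<float>(values.size());
    const float mean     = reduce_sum(values) / n;
    const float variance = reduce_sum_squares(values) / n - mean * mean;
    // Clamp tiny negative values caused by rounding before taking the root.
    return { mean, std::sqrt(std::max(variance, 0.0f)) };
}
```

Calling `mean_stddev` on `{2, 4, 4, 4, 5, 5, 7, 9}` gives a mean of 5 and a standard deviation of 2. Only two scalars need to travel from the device to the host, which keeps the final arithmetic cheap.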
2018-11-02 | COMPMID-1388: Change default CLTuner to the one for the detected GPU | Georgios Pinitas
Sets a default tuner for the detected target if no tuner is specified in default_init() Change-Id: I27f1b9bbc0df91c1940315c6cc9042720cd1d3fe Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/139630 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-1167: Validation for NEDepthwiseConvolutionLayer | Abe Mbise
Change-Id: I9689e1a0627dc015dd2ce98417e4c97bb55581bb Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/131327 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-1188 : Rename TNOX to Mali-G76 | Georgios Pinitas
Change-Id: I136f7aa4bca268abd4fbe4f6ce4bcc2708ec3671 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/139689 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-1310: Create graph validation executables. | Georgios Pinitas
Change-Id: I9e0b57b1b83fe5a95777cdaeddba6ecef650bafc Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/138697 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-1271: New system for GEMM heuristics | David Mansell
This patch implements a system for separating the "validity" from "preferred" aspect of the current heuristics in gemm_*.cpp. Now, each gemm_*.cpp defines a list of candidate implementations, each of which supplies an is_valid() function (to check for validity), an is_preferred() function (the "heuristic" part), and an instantiate() function which actually produces the GemmCommon object pointer. The actual gemm() function is now templated and uses this list to select an implementation. This patch also implements a mechanism to identify the preferred implementation, and override it via the GemmConfig structure. Change-Id: Id49ab7af8bf2e3e9fd951a9698883ade234d40e1 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/139120 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
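The selection scheme described in this commit message can be sketched as follows. Only the names is_valid(), is_preferred(), instantiate(), GemmCommon and GemmConfig come from the message; every type layout, the candidate list, the fallback rule and the override-by-name field are assumptions made for illustration and should not be read as the library's actual code.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical problem description and config (field names are assumptions).
struct GemmArgs   { unsigned M, N, K; };
struct GemmConfig { std::string forced_name; }; // empty means "no override"

struct GemmCommon
{
    virtual ~GemmCommon() = default;
    virtual std::string name() const = 0;
};

// Trivial GemmCommon that only remembers which candidate produced it.
struct NamedGemm : GemmCommon
{
    explicit NamedGemm(std::string n) : _name(std::move(n)) {}
    std::string name() const override { return _name; }
    std::string _name;
};

// One candidate: validity check, heuristic, and factory.
struct GemmImplementation
{
    std::string                                                  name;
    std::function<bool(const GemmArgs &)>                        is_valid;
    std::function<bool(const GemmArgs &)>                        is_preferred;
    std::function<std::unique_ptr<GemmCommon>(const GemmArgs &)> instantiate;
};

using GemmPtr = std::unique_ptr<GemmCommon>;

// Made-up candidate list; the conditions are arbitrary examples.
inline std::vector<GemmImplementation> example_candidates()
{
    return {
        { "gemv", [](const GemmArgs &a) { return a.M == 1; }, [](const GemmArgs &) { return true; },
          [](const GemmArgs &) { return GemmPtr(new NamedGemm("gemv")); } },
        { "blocked", [](const GemmArgs &a) { return a.K % 4 == 0; }, [](const GemmArgs &a) { return a.K >= 8; },
          [](const GemmArgs &) { return GemmPtr(new NamedGemm("blocked")); } },
        { "generic", [](const GemmArgs &) { return true; }, [](const GemmArgs &) { return false; },
          [](const GemmArgs &) { return GemmPtr(new NamedGemm("generic")); } },
    };
}

// Pick the forced candidate if one is named and valid; otherwise the first
// valid and preferred one; otherwise the first valid one; otherwise nullptr.
inline GemmPtr select_gemm(const std::vector<GemmImplementation> &candidates, const GemmArgs &args, const GemmConfig &cfg)
{
    const GemmImplementation *fallback = nullptr;
    for(const auto &impl : candidates)
    {
        assert(impl.is_valid && impl.is_preferred && impl.instantiate);
        if(!impl.is_valid(args))
        {
            continue;
        }
        if(!cfg.forced_name.empty())
        {
            if(impl.name == cfg.forced_name)
            {
                return impl.instantiate(args);
            }
            continue; // an override disables the heuristic entirely
        }
        if(impl.is_preferred(args))
        {
            return impl.instantiate(args);
        }
        if(fallback == nullptr)
        {
            fallback = &impl;
        }
    }
    return fallback != nullptr ? fallback->instantiate(args) : nullptr;
}
```

Keeping validity separate from preference means an override can bypass the heuristic without ever selecting an implementation that cannot handle the problem: in this sketch, forcing "gemv" for a problem with M != 1 returns nullptr instead of a broken object.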
2018-11-02 | COMPMID-1384: graph_mobilenet fails for NHWC on OpenCL | Georgios Pinitas
Makes GEMM3D account for top padding when jumping between planes.
Change-Id: Ia7c16cfa5498de106774ce42cbc4716e9f43195b Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/139612 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-1383: OCLGrind failure in CLDepthwiseConvolution 3x3 stride 1 NHWC | Georgios Pinitas
OCLGrind seems to operate incorrectly on some intrinsics when a mixture of vectors and scalars is passed to them.
Change-Id: I9e3782e739603ec59bacc3c77d91a70b1899fe3e Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/139474 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-1246: Fixed typo in MidgardTuner.h filename | Anthony Barbier
Change-Id: Iade03ee67939f15a455723346a4ee0a890a8278e Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/139539 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-1387: Removed unnecessary extended_atomics extension for clMeanStdDevKernel | Anthony Barbier
Change-Id: I47e55d61a9d9464ef63b0da890e06f8a0b434796 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/139527 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-1124 : Fixes in CLLSTM layer | Georgios Pinitas
Change-Id: Ifc8e12c296d3ef2bf8e0f0bf1b87b7fd47a1fad7 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/139248 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Ruomei Yan <ruomei.yan@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2018-11-02 | COMPMID-1340 - Implementing Winograd Convolution Layer 1x5/5x1 on OpenCL NHWC | Gian Marco Iodice
Change-Id: Id5e0795238f77c049df9c109dafc5ef878c1897d Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/139234 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-1378: Graph: bias in FullyConnectedLayer should always have data type S32 | Michele Di Giorgio
Change-Id: Ib4a82b11e810664c663dfc66281ff6bf44d2aa6b Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/139029 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-1349: Add support for QASYMM8 LOGISTIC activation in CLActivationLayer | Michele Di Giorgio
Change-Id: Ibabce61cf5427de80078a6468023bed05f5e7c2c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/139006 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02 | COMPMID-1380: Pre-work for SVE support. | David Mansell
This patch makes the needed infrastructure changes to allow SVE kernels to be added later on. Change-Id: Ide5bccac2f47278e93fff3d648231aee2d5f8c2e Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/139070 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-1337 Implementing Winograd Convolution Layer 1x3 and 3x1 kernels on OpenCL NHWC | Giorgio Arena
Change-Id: Ia07e0dfcbcd07366c4bcb956e298369fb12a0369 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/138759 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02 | COMPMID-1339 - Implementing Winograd Convolution Layer 1x5 and 5x1 kernels on OpenCL NCHW | Gian Marco Iodice
Change-Id: Ia293cd89651146a0e27e5f7c74ca9c924807e83c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/138707 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02 | COMPMID-1246 Change CLLSTM to match Android tests | Michalis Spyrou
Allow the cell-to-input weights to be nullptr if CIFG and peephole are both enabled.
Change-Id: I6df705d69551f0fddeedd41b2044278d4575469c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/137902 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02 | COMPMID-1246 Fix CLDeconvolution arguments when calculating the output dimension in configure | Michalis Spyrou
Change-Id: I0e5044e5596c065647d0e913fe9122bf6fd9e205 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/138861 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-970 : Remove QS8 / QS16 support | Vidhya Sudhan Loganathan
Removed QS32 references Change-Id: Ic7df02c08ae7aa1b7dcae15bdda113321af851b8 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/138703 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-1374: Fix constraints on AArch32 SGEMM. | David Mansell
The "cc" constraint was missing on the a53/a55r1 versions of this kernel. Added "memory" to these (and the generic kernel) as well for safety. Change-Id: I4df1b2fde43c20550ba7a51436b326f5e9e9871f Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/138812 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
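For readers unfamiliar with the clobber list mentioned in this commit message, the snippet below shows where "cc" and "memory" go in a GCC/Clang extended inline assembly statement. It is a minimal sketch with an empty assembly body and a hypothetical function name, assuming a GCC-compatible compiler; it is not the SGEMM kernel and contains no AArch32 instructions.

```cpp
#include <cassert>

// Hypothetical helper for illustration; assumes GCC or Clang extended asm.
inline int pass_through_asm(int value)
{
    // The asm body is empty, so nothing is actually modified here. The
    // clobber list still tells the compiler to assume that the condition
    // flags ("cc") and arbitrary memory ("memory") may have changed, so it
    // will not keep flags or cached memory values live across the statement.
    __asm__ __volatile__(""
                         : "+r"(value) // operand is both read and written
                         :             // no input-only operands
                         : "cc", "memory");
    return value;
}
```

In a real kernel that sets flags or writes through pointers, omitting these clobbers lets the compiler assume the flags and memory are unchanged, which can produce miscompiled code that only fails under particular optimization settings.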
2018-11-02 | COMPMID-1355: OpenMP compilation is broken. | Georgios Pinitas
Change-Id: If821ff02a68551d2181b2b7fdc3028cb5343341f Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/138150 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-1369: Revert accidental formatting of RSH's repo | Anthony Barbier
Pulled latest fixes from David's repo:
commit f43ebe932c84083332b0b1a0348241b69dda63a7
Author: David Mansell <David.Mansell@arm.com>
Date: Tue Jul 3 18:09:01 2018 +0100
Whitespace tidying, fixed comment in gemv_batched imported from ACL.
Change-Id: Ie37a623f44e90d88072236cb853ac55ac82d5f51 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/138530 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by: David Mansell <david.mansell@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-970 : Remove QS8 / QS16 support | Vidhya Sudhan Loganathan
Removed fixed point related code. Change-Id: I487acf138dace3b0450e0d72ca7071eaec254566 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/137678 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-1363: Handle null bias pointer in DirectConvolutionOutputStageKernel | Anthony Barbier
Change-Id: I412db69f31811fe4a7d262a0f146ef545731fc02 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/138500 Reviewed-by: Aron Virginas-Tar <aron.virginas-tar@arm.com> Reviewed-by: Vidhya Sudhan Loganathan <vidhyasudhan.loganathan@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-1323: Added proper messages in graph NodeValidators. | Georgios Pinitas
Change-Id: I5f66c897c552c41c12bbc2244f866be93d5032d0 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/138393 Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-1352: Disable support for 4D softmax layer. | Georgios Pinitas
Change-Id: Ia8afabb36e644895d321ded51a6a0676347443e1 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/138387 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-1358: Debug FB's graph example on Android. | Georgios Pinitas
Change-Id: I007b8eed66789b95aa9c08a96b6253472995168e Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/138332 Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-1338 Split winograd.cl | Giorgio Arena
Change-Id: I583227fc1a38b1a34de253e383d71cca66007f18 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/138273 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02 | COMPMID-1350: (Nightly) Fix GEMMConvolutionLayer FP32/FP16 CL failing | Georgios Pinitas
Change-Id: I8e8dee355bbf708cc3abb22de867f848a22dccd6 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/138022 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
2018-11-02 | COMPMID-1325 Add data layout suffix to the config_id on OpenCL kernels | Giorgio Arena
Change-Id: I78d7b4a53fe6525cc19fd49c5d555a4334e6de3b Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/137903 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
2018-11-02 | COMPMID-811 Add NHWC data format support for CL depthwise convolution | Giorgio Arena
Change-Id: I574f7945f0be009c638d860028bce8b52b4120fd Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/136484 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02 | COMPMID-1201 - Implementing Winograd Convolution Layer 1x3 and 3x1 kernels on OpenCL | Gian Marco Iodice
Change-Id: I39667bab49daa4da009694163274a59fd3574c73 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/137595 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02 | COMPMID-1336: Add CLArithmeticAddition support for QASYMM8 | Michele Di Giorgio
Change-Id: Ice2bb644841fdea4e776872ff5481eb927e66bd1 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/137714 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-1334 (Nightly) Fix std::bad_alloc error in 32-bit NEON runs | Giorgio Arena
Change-Id: I412420a4f02225708fcc8f446a5af5a9faf7d0a5 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/137846 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-1345: Switched from using mutexes to atomics in the CPP Scheduler | Anthony Barbier
Change-Id: Ie74bb71057027bca3b8a9b03b4a9f156d58b3253 Note: No performance impact as this part of the code is not currently used Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/137807 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02 | COMPMID-809: Add NHWC data format on CLGEMMConvolutionLayer. | Georgios Pinitas
Change-Id: I50e4f5e7d47e21c300f754bee2c216863075b5cf Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/136191 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02 | COMPMID-1253: Nightly: Fix Canny Edge NEON failing | Michele Di Giorgio
Change-Id: If0836522792717a843c1cab405afc9320ce53079 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/137162 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-1283: (GitHub issue) after convolution output data is zero | Georgios Pinitas
During the mutating passes, accessors of optimized nodes were dropped instead of being transferred to the appropriate tensors.
Change-Id: I29183984d94806bdfb5c92af3acefd928c0fd171 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/136036 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-1204 Add NHWC data format support to Winograd input transform 4x4_5x5 | Giorgio Arena
Change-Id: I3dffdd1772b78db27a4374f074a24a15a9552189 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/134859 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-1293: Handle aligned allocations | Georgios Pinitas
Change-Id: I6e642c8cd968240f883c327464519e57e5d0c3e3 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/136088 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>