aboutsummaryrefslogtreecommitdiff
path: root/src/core/CL
AgeCommit message (Collapse)Author
2018-11-02COMPMID-524 - Implemented CLTuner objectGian Marco
Change-Id: Idbdbecca1fc299ed042936119d90e2bed8db0938 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/87101 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02COMPMID-417: Fix CL compiler warningsMoritz Pflanzer
Change-Id: If2c90f7352bff64abbf2faec7f33340e6873b5cd Reviewed-on: http://mpd-gerrit.cambridge.arm.com/89020 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02COMPMID-541: Fix padding in CLMinMaxLocationKernelMoritz Pflanzer
Change-Id: Ie17e3f14c428553d433da2a564e016bfac7749a9 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/88881 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2018-11-02COMPMID-417 fix the depthwise conv bugsteniu01
Change-Id: Ica3c26d09f8009240467e0d3a12f585170fbcd44 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/88677 Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02COMPMID-515: L2 Pooling for FP32/FP16 in CL.Georgios Pinitas
Change-Id: I43641fa672f5905ca62edd1f63fc93e0cf7ea382 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/85963 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02COMPMID-417: Fix clRetainEventMoritz Pflanzer
Change-Id: I0a52b2d4f177f0b0ae67e9674ff39a9ae30452b9 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/88457 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02COMPMID-516 Increase tolerance rate of Scale, Conv, fully connected and GEMMsteniu01
This patch also fix the scale kernel issue where it was calcuated the scale factor inside the gpu but now in the CPU. The GPU and CPU gave different result for simple float division operation Change-Id: Ib6709cb6c41dcf4fc0fa4eb79e481430695bf40e Reviewed-on: http://mpd-gerrit.cambridge.arm.com/87266 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com>
2018-11-02COMPMID-452 CL Generic Depthwise Convolution implementation.Giorgio Arena
Change-Id: I115e48fe6ce5e281f3791aa5d80fdc754cdd2b5e Reviewed-on: http://mpd-gerrit.cambridge.arm.com/85082 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02COMPMID-522 - Added support for GlobalPooling in CLPoolingLayer and ↵Gian Marco Iodice
CLFlattening for 3D tensor Change-Id: Ifc7db1e4d4af322a4dcbfeb3e132e5c326596872 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/86618 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02COMPMID-417: Add support for floats in scale.Georgios Pinitas
Change-Id: I7d714ba13861509080a89817f54e9d32da83e970 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/86026 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02COMPMID-417: Fix CLNonLinearFilterGeorgios Pinitas
CLNonLinearFilter 5x5 was reading out-of-bounds for cross and disk masks. Makes sure that read is in bounds and elements are shifted after. Change-Id: I57a611e24cc9cadd50a36881e408a5a0d4ea5a3d Reviewed-on: http://mpd-gerrit.cambridge.arm.com/88056 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com>
2018-11-02COMPMID-417: Refactor GPU detectionMoritz Pflanzer
Change-Id: Ia21f2c51cb3bb4390b0ad26590bca63ac8446e17 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/87927 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02COMPMID-417: Fix setting CL targetMoritz Pflanzer
Change-Id: I4a8fc2ca55d6702ab2730de1012d6ef223395ef5 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/87904 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-462: Implement TensorReshape for NEON and CL.Georgios Pinitas
Change-Id: I11b39c2ceca26ade73822e29a384ef866ae05729 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/87707 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02COMPMID-417 Fix reduction kernel's __local buffer sizeMichalis Spyrou
Change-Id: If97a79d86b174b1d9b41360303d624e3b2d22001 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/87703 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02COMPMID-448: Implement CL Quantization/Dequantization Layer.Michele Di Giorgio
Change-Id: Id002e23a2ac48af3d245416dc6411d9a04a1e513 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/81827 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02COMPMID-485: Memory ManagerGeorgios Pinitas
Change-Id: Ib421b7622838f050038cd81e7426bb1413a7d6e6 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/87376 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-477 - Optimized CLDirectConvolution1x1 for BifrostGian Marco Iodice
- Fixed bug in CLDirectConvolution3x3 Change-Id: Iaf34ef44f0b7bc02e66f3eb4452ff7a90ef83523 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/86725 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com>
2018-11-02COMPMID-417 Fix ROIPoolingSiCong Li
* Fix ROIPooling in NEON, CL and Reference. Change-Id: Id5066625e5073e0bfebe69391f7941e993003296 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/87435 Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-514 (3RDPARTY_UPDATE)(DATA_UPDATE) Add support to load .npy dataSiCong Li
* Add tensorflow_data_extractor script. * Incorporate 3rdparty npy reader libnpy. * Port AlexNet system test to validation_new. * Port LeNet5 system test to validation_new. * Update 3rdparty/ and data/ submodules. Change-Id: I156d060fe9185cd8db810b34bf524cbf5cb34f61 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/84914 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02COMPMID-415 - Fixed bug in CLDepthConcatenateKernelGian Marco Iodice
Change-Id: Ieedb714cb3666504c175aa488505e0485778c589 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/86705 Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02COMPMID-519: Add support for Lower and Upper Bounded RELU for CL/NEONGeorgios Pinitas
Change-Id: I7b16216ac59c899a33942bf17757b54535256d7a Reviewed-on: http://mpd-gerrit.cambridge.arm.com/86172 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-424 Implemented reference implementation, new output valid region ↵Isabella Gottardi
and validation tests (NEON and CL) for Scale Change-Id: I056fa3588b807a97cacf0b8afaec56e37ffc92af Reviewed-on: http://mpd-gerrit.cambridge.arm.com/83872 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-476 L2 Normalization for CLMichalis Spyrou
Change-Id: I88f87173645880eb823916c5d4ac884c372a4fb4 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/83269 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02COMPMID-417: Fix invalid read in CL GEMM accumulate biasesMoritz Pflanzer
Change-Id: Ie7786a29faa0d98d8ad65c2333d0d6a1665340bc Reviewed-on: http://mpd-gerrit.cambridge.arm.com/85635 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02COMPMID-358 Implement OpenCL ROI PoolingSiCong Li
* Implement OpenCL ROI Pooling * Add CLROIPoolingLayer benchmarks Change-Id: I8786d01d551850a1b4d599a48fabe3925e0a27d0 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/79833 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02COMPMID-477 - Optimized batched case in CLConvolutionLayerGian Marco Iodice
Change-Id: I4ef18f49f1da0cb816aaa0762466b940792c15ed Reviewed-on: http://mpd-gerrit.cambridge.arm.com/84162 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-417: Port PoolingLayer to new validation.Georgios Pinitas
Change-Id: I7f2f5f5f81ad9932661fc4c660bf90614288bc96 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/85270 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-513 Choose maximum local workgroup size at run timesteniu01
Change-Id: I9ab3cf6dc92a93b0ae5f746e078355e443b3a545 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/84906 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-452 CL Depthwise Separable Convolution Layer kernel implementation, ↵Giorgio Arena
validation and benchmarking for 3x3xC depthwise filter and DataType::F32. Change-Id: I95c0c87709763cdbf58d0de66025eac86e30791b Reviewed-on: http://mpd-gerrit.cambridge.arm.com/82768 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Steven Niu <steven.niu@arm.com>
2018-11-02COMPMID-477 - Optimized CLNormalizationLayerGian Marco Iodice
CLPixelWiseMultiplication has been removed within the function Change-Id: Ibe7edd7921d5cef6ff68fdeeca89771129a8eaea Reviewed-on: http://mpd-gerrit.cambridge.arm.com/84459 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-431 Port OpenCL pooling layer to use fixed pointsteniu01
Change-Id: I6a73cd6582097aaefa83588aad789bdefdc74406 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/79967 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2018-11-02COMPMID-477 - Optimizing Pooling 3x3 with stride_x <= 3 on OpenCLGian Marco Iodice
Change-Id: Ie000166307cdb5bfae00ebf84d35e49a6bfb9dbd Reviewed-on: http://mpd-gerrit.cambridge.arm.com/83372 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-417: Cleanup CL FullyConnectedLayerMoritz Pflanzer
Change-Id: Ic7191be1f136c6aad4037cf2ec4bc6d7d0e440d3 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/83713 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-417 - Fixed call of direct convolution 1x1 for bifrostGian Marco Iodice
Change-Id: Ic4e56e8881b8c66758e67c486514ec397cf43f8e Reviewed-on: http://mpd-gerrit.cambridge.arm.com/84592 Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02COMPMID-477 - Optimized Direct Convolution 3x3 and 5x5 (f32) for Bifrost.Gian Marco Iodice
Each work-item computes 4x3 output elements in case of 3x3 convolution and 4x2 in case of 5x5 convolution Change-Id: I6ebbaff8b7e971c1f90d5845c0b58d2a40f39df5 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/84345 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02COMPMID-417: Add in-place support for batch-normalization.Georgios Pinitas
Change-Id: I4b0c9348f3bc2addc198a76fadd1b583abf42b60 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/84434 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02COMPMID-450 Add YOLOV2 benchmark testsSiCong Li
* Migrate BatchNormalizationLayer to new benchmark system. * Add YOLOV2 benchmark tests. * Fix F16 type issue in activation_layer cl kernel. * Separate precommit tests from nightly tests. Change-Id: I3f206e3f7469be6749d630ede8dcc9fb399de8b0 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/81582 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-478 Implemnt CL direct convolution 5x5steniu01
Change-Id: I4b975aff310cda9964d8c5dcee182d5d5c82741b Reviewed-on: http://mpd-gerrit.cambridge.arm.com/83474 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02COMPMID-474 - Add support for QS8/QS16 DirectConvolution CLMichalis Spyrou
Change-Id: I537e4acbc02c8d880ff8630ea62223e0f1a1dda3 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/82875 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02COMPMID-424 NEON/CL Harris Corners validation tests.Giorgio Arena
Change-Id: I82d2a73f515a8d45d16b9ddb702fea51ae05c82e Reviewed-on: http://mpd-gerrit.cambridge.arm.com/79687 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com>
2018-11-02COMPMID-417 - Fixed bug in CLCol2ImKernek related to the stride passed ↵Gian Marco Iodice
during the configuration Change-Id: I9818f72e5ddd0d21f6700c651fc968ff61507424 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/83909 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com>
2018-11-02COMPMID-417 - Added clFinish to CLSymbolsGian Marco Iodice
Change-Id: If3ee89d91f105489c766b9e714fdf72da8fbfe78 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/83664 Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02COMPMID-459 Collapse CL Im2col's higher dimensionssteniu01
Change-Id: I0ccc39cbcf6926e6810faf3fe264c4af7adc3f7b Reviewed-on: http://mpd-gerrit.cambridge.arm.com/83070 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-477 - Optimizing CLDirectConvolution 3x3 on OpenCL and added the ↵Gian Marco Iodice
auto configuration Change-Id: I3c8384dcbc9d7786943134bb658dafb35356d90d Reviewed-on: http://mpd-gerrit.cambridge.arm.com/83253 Reviewed-by: Steven Niu <steven.niu@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-09-17COMPMID-415: Add autoconfigure to CLCol2ImKernelAnthony Barbier
Change-Id: I50c114d0c78d443a21bf43aa36a370474f0769ce Reviewed-on: http://mpd-gerrit.cambridge.arm.com/82955 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-09-17COMPMID-417: Fix CLNormalization error issue.Georgios Pinitas
Change-Id: Ie538245ee0451e4cdb28120e80b9a65f56a07e7d Reviewed-on: http://mpd-gerrit.cambridge.arm.com/82933 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-09-17COMPMID-472 : Implement Floor for CL and NEON.Georgios Pinitas
Change-Id: I675a4545b1fe9ab665a07c834720bfe7ff589cee Reviewed-on: http://mpd-gerrit.cambridge.arm.com/82527 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-09-17COMPMID-443 collapse higher dimension for CL col2im kernelsteniu01
Change-Id: I99d41c7c95b8d4e3cd5c1685c68936b6a2db4192 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/81885 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-09-17COMPMID-417 NEON/CL MeanStdDev bugfix using FillBorderKernelGiorgio Arena
Change-Id: Ic48ba7f69783d0e1e80611264e2bc67d1732436e Reviewed-on: http://mpd-gerrit.cambridge.arm.com/81293 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>