path: root/src/core/CL/kernels
Age  Commit message  Author
2018-11-02  COMPMID-448: Implement CL Quantization/Dequantization Layer.  Michele Di Giorgio
Change-Id: Id002e23a2ac48af3d245416dc6411d9a04a1e513 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/81827 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02  COMPMID-477 - Optimized CLDirectConvolution1x1 for Bifrost  Gian Marco Iodice
- Fixed bug in CLDirectConvolution3x3 Change-Id: Iaf34ef44f0b7bc02e66f3eb4452ff7a90ef83523 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/86725 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com>
2018-11-02  COMPMID-514 (3RDPARTY_UPDATE)(DATA_UPDATE) Add support to load .npy data  SiCong Li
* Add tensorflow_data_extractor script. * Incorporate 3rdparty npy reader libnpy. * Port AlexNet system test to validation_new. * Port LeNet5 system test to validation_new. * Update 3rdparty/ and data/ submodules. Change-Id: I156d060fe9185cd8db810b34bf524cbf5cb34f61 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/84914 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02  COMPMID-415 - Fixed bug in CLDepthConcatenateKernel  Gian Marco Iodice
Change-Id: Ieedb714cb3666504c175aa488505e0485778c589 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/86705 Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02  COMPMID-424 Implemented reference implementation, new output valid region and validation tests (NEON and CL) for Scale  Isabella Gottardi
Change-Id: I056fa3588b807a97cacf0b8afaec56e37ffc92af Reviewed-on: http://mpd-gerrit.cambridge.arm.com/83872 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02  COMPMID-476 L2 Normalization for CL  Michalis Spyrou
Change-Id: I88f87173645880eb823916c5d4ac884c372a4fb4 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/83269 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02  COMPMID-417: Fix invalid read in CL GEMM accumulate biases  Moritz Pflanzer
Change-Id: Ie7786a29faa0d98d8ad65c2333d0d6a1665340bc Reviewed-on: http://mpd-gerrit.cambridge.arm.com/85635 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02  COMPMID-358 Implement OpenCL ROI Pooling  SiCong Li
* Implement OpenCL ROI Pooling * Add CLROIPoolingLayer benchmarks Change-Id: I8786d01d551850a1b4d599a48fabe3925e0a27d0 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/79833 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02  COMPMID-477 - Optimized batched case in CLConvolutionLayer  Gian Marco Iodice
Change-Id: I4ef18f49f1da0cb816aaa0762466b940792c15ed Reviewed-on: http://mpd-gerrit.cambridge.arm.com/84162 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02  COMPMID-417: Port PoolingLayer to new validation.  Georgios Pinitas
Change-Id: I7f2f5f5f81ad9932661fc4c660bf90614288bc96 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/85270 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02  COMPMID-452 CL Depthwise Separable Convolution Layer kernel implementation, validation and benchmarking for 3x3xC depthwise filter and DataType::F32.  Giorgio Arena
Change-Id: I95c0c87709763cdbf58d0de66025eac86e30791b Reviewed-on: http://mpd-gerrit.cambridge.arm.com/82768 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Steven Niu <steven.niu@arm.com>
2018-11-02  COMPMID-477 - Optimized CLNormalizationLayer  Gian Marco Iodice
CLPixelWiseMultiplication has been removed within the function Change-Id: Ibe7edd7921d5cef6ff68fdeeca89771129a8eaea Reviewed-on: http://mpd-gerrit.cambridge.arm.com/84459 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02  COMPMID-431 Port OpenCL pooling layer to use fixed point  steniu01
Change-Id: I6a73cd6582097aaefa83588aad789bdefdc74406 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/79967 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2018-11-02  COMPMID-477 - Optimizing Pooling 3x3 with stride_x <= 3 on OpenCL  Gian Marco Iodice
Change-Id: Ie000166307cdb5bfae00ebf84d35e49a6bfb9dbd Reviewed-on: http://mpd-gerrit.cambridge.arm.com/83372 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02  COMPMID-417: Cleanup CL FullyConnectedLayer  Moritz Pflanzer
Change-Id: Ic7191be1f136c6aad4037cf2ec4bc6d7d0e440d3 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/83713 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02  COMPMID-417 - Fixed call of direct convolution 1x1 for bifrost  Gian Marco Iodice
Change-Id: Ic4e56e8881b8c66758e67c486514ec397cf43f8e Reviewed-on: http://mpd-gerrit.cambridge.arm.com/84592 Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02  COMPMID-477 - Optimized Direct Convolution 3x3 and 5x5 (f32) for Bifrost.  Gian Marco Iodice
Each work-item computes 4x3 output elements in case of 3x3 convolution and 4x2 in case of 5x5 convolution Change-Id: I6ebbaff8b7e971c1f90d5845c0b58d2a40f39df5 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/84345 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02  COMPMID-417: Add in-place support for batch-normalization.  Georgios Pinitas
Change-Id: I4b0c9348f3bc2addc198a76fadd1b583abf42b60 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/84434 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02  COMPMID-478 Implement CL direct convolution 5x5  steniu01
Change-Id: I4b975aff310cda9964d8c5dcee182d5d5c82741b Reviewed-on: http://mpd-gerrit.cambridge.arm.com/83474 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02  COMPMID-474 - Add support for QS8/QS16 DirectConvolution CL  Michalis Spyrou
Change-Id: I537e4acbc02c8d880ff8630ea62223e0f1a1dda3 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/82875 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02  COMPMID-424 NEON/CL Harris Corners validation tests.  Giorgio Arena
Change-Id: I82d2a73f515a8d45d16b9ddb702fea51ae05c82e Reviewed-on: http://mpd-gerrit.cambridge.arm.com/79687 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com>
2018-11-02  COMPMID-417 - Fixed bug in CLCol2ImKernel related to the stride passed during the configuration  Gian Marco Iodice
Change-Id: I9818f72e5ddd0d21f6700c651fc968ff61507424 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/83909 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com>
2018-11-02  COMPMID-459 Collapse CL Im2col's higher dimensions  steniu01
Change-Id: I0ccc39cbcf6926e6810faf3fe264c4af7adc3f7b Reviewed-on: http://mpd-gerrit.cambridge.arm.com/83070 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02  COMPMID-477 - Optimizing CLDirectConvolution 3x3 on OpenCL and added the auto configuration  Gian Marco Iodice
Change-Id: I3c8384dcbc9d7786943134bb658dafb35356d90d Reviewed-on: http://mpd-gerrit.cambridge.arm.com/83253 Reviewed-by: Steven Niu <steven.niu@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-09-17  COMPMID-415: Add autoconfigure to CLCol2ImKernel  Anthony Barbier
Change-Id: I50c114d0c78d443a21bf43aa36a370474f0769ce Reviewed-on: http://mpd-gerrit.cambridge.arm.com/82955 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-09-17  COMPMID-417: Fix CLNormalization error issue.  Georgios Pinitas
Change-Id: Ie538245ee0451e4cdb28120e80b9a65f56a07e7d Reviewed-on: http://mpd-gerrit.cambridge.arm.com/82933 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-09-17  COMPMID-472 : Implement Floor for CL and NEON.  Georgios Pinitas
Change-Id: I675a4545b1fe9ab665a07c834720bfe7ff589cee Reviewed-on: http://mpd-gerrit.cambridge.arm.com/82527 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-09-17  COMPMID-443 collapse higher dimension for CL col2im kernel  steniu01
Change-Id: I99d41c7c95b8d4e3cd5c1685c68936b6a2db4192 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/81885 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-09-17  COMPMID-417 NEON/CL MeanStdDev bugfix using FillBorderKernel  Giorgio Arena
Change-Id: Ic48ba7f69783d0e1e80611264e2bc67d1732436e Reviewed-on: http://mpd-gerrit.cambridge.arm.com/81293 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-09-17  COMPMID-355 Implement CL DirectConvolution1x1  SiCong Li
* Add FP16 to validation tests. * Complete benchmark tests for CL and NEON Direct Convolution. Change-Id: Ie73d8580832372db01b82b39786fd9c8be560090 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/82014 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-09-17  COMPMID-438: Add support for floating point Min-Max Location layer.  Michele Di Giorgio
Change-Id: I84ae564a40fc7320a6f94a84d53906ba51404f51 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/79797 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-09-17  COMPMID-413: Add support for QS8 and QS16 CLNormalizationLayer.  Michele Di Giorgio
Change-Id: I1aaa9fb8d05796bbca9cfae584e084646552bb71 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/80155 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-09-17  COMPMID-417 - Bug Fix WarpPerspective kernel  Isabella Gottardi
Change-Id: Ic26fb3b1b60c1a1f4848d683862a25bd1ebc2cc8 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/82053 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Steven Niu <steven.niu@arm.com>
2018-09-17  COMPMID-455 - Optimizing CLIm2ColKernel  Gian Marco Iodice
Change-Id: Iee618948cc8f310ee9af2d786240e8120e4c6ab9 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/81665 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-09-17  COMPMID-417: Fix F16 CLSoftmaxLayer  Moritz Pflanzer
Change-Id: I231b1fcaea8bfb11f8306bc71fdde78fadeed60d Reviewed-on: http://mpd-gerrit.cambridge.arm.com/81832 Reviewed-by: Steven Niu <steven.niu@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-09-17  COMPMID-355 Implement 3x3 CL direct convolution  steniu01
Change-Id: I1b44dc375045964e65557f0ead57a7c12d6bf097 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/81418 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-09-17  COMPMID-446: Add support for QS8/QS16 CL Arithmetic Add/Sub  Michele Di Giorgio
Change-Id: I84fc457a9c28856a11322944822d2fabaf92e8e4 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/80528 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com>
2018-09-17  COMPMID-425 Port CLBatchnormalization to support QS8/QS16  Michalis Spyrou
Change-Id: I46c93305f377666ea0915ff789b7dfdfff596087 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/78862 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-09-17  COMPMID-443 Collapse higher dimension for pooling layer and normalization layer  steniu01
Change-Id: Icd08eefbd938c11c77dc4264af1fa3664fb336bc Reviewed-on: http://mpd-gerrit.cambridge.arm.com/80568 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-09-17  COMPMID-443 Change CLSoftMaxLayerKernel to use 3D tensor and collapse the higher dimension  steniu01
Change-Id: I730ef45d855113d8baa7d89818441e168ea43c63 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/80573 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-09-17  COMPMID-406: Port CLActivationLayer to use QS8/QS16.  Georgios Pinitas
Change-Id: Ia4114984c38e1d2027ad97335b3c6c11f5754e23 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/78727 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-09-17  COMPMID-417: Port DepthConcatenate to QS8/QS16 for NEON/CL.  Georgios Pinitas
Change-Id: I3dddae63043c7aa18d908a4fc8abacf3c64f98ca Reviewed-on: http://mpd-gerrit.cambridge.arm.com/80081 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Steven Niu <steven.niu@arm.com>
2018-09-17  COMPMID-443 Use 3D tensor for pixel multiply (Needed for Normalization Layer)  Anthony Barbier
Change-Id: I117688f12334e6afc705c863acdf71b0bb1fc6e8 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/80352 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-09-17  COMPMID-443: Collapse higher dimensions for activation layer  Anthony Barbier
Change-Id: I5943235aff1bb6440e3ab08e818d53aa5d94143a Reviewed-on: http://mpd-gerrit.cambridge.arm.com/80349 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-09-17  COMPMID-443: Use 3D tensors for fill_border_image  Anthony Barbier
2x performance improvement on some GoogLeNet Pooling tests Change-Id: If75336aa6308731a06462a73cd9209d24574509e Reviewed-on: http://mpd-gerrit.cambridge.arm.com/80342 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Steven Niu <steven.niu@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-09-17  COMPMID-431 Port CLDepthConvert to use 8-bit and 16-bit fixed point  steniu01
Change-Id: Iedea9e985427e6242f34a5362615f79c0526d5bd Reviewed-on: http://mpd-gerrit.cambridge.arm.com/79786 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-09-17  COMPMID-429: Port CLSoftmaxLayer to QS16.  Georgios Pinitas
Change-Id: I3a0394364629654747439372d32f692b6ca29ee0 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/80219 Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com>
2018-09-17  COMPMID-440, COMPMID-441 - Port CLConvolutionLayer and CLFullyConnectedLayer to support 16 bit fixed point  Gian Marco Iodice
Change-Id: I8d8ef2cb5ec453eb83fba8d8077550b96ed4bceb Reviewed-on: http://mpd-gerrit.cambridge.arm.com/79837 Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-09-17  COMPMID-409: Add support for QS8 and QS16 CLPixelWiseMultiplication.  Michele Di Giorgio
Change-Id: I7f66d49d746ba9fb6e726ccab83d3a97b8ddef80 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/78491 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-09-17  COMPMID-417: Fix assert in GEMMTranspose  Moritz Pflanzer
The assert was checking the wrong thing. The output shape would be empty only if the window over the input is smaller than the number of processed elements. However, the valid region will be empty if the input's first dimension is less than the number of elements processed. That required the changes in TensorShape. Change-Id: I36fed7893dfd502e26c5c776c9a2d774d6cd91c6 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/79813 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>