path: root/src/core/CL/kernels
Age | Commit message | Author
2018-11-02 | COMPMID-685 Extend CLTuner support to other DL functions | Giorgio Arena
Change-Id: Ica97857c2145228e4a6088724681ec1c0a138133 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/95918 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02 | COMPMID-661: issue# 23 Scale border fix (#26) | Daniil Efremov
Changes in CL and reference in terms of border handling. Change-Id: I5bed95b1f4c308629d7113455dc8a55d74500bcd Reviewed-on: http://mpd-gerrit.cambridge.arm.com/95742 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02 | COMPMID-676: Rework TensorInfo building | Georgios Pinitas
Change-Id: Ic98f64ffe30739437a1fe31ef98d83ee900741e3 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/95512 Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-556: Fix CLNormalization issues. | Georgios Pinitas
-Extracts calculations from the CL kernel core loop. -Changes the access elements for CROSS_MAP to reduce the applied redundant padding. Change-Id: If41c3adddd977be9386fe34940d055c301ccbb91 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/95917 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-678 Align Convolution Interfaces | Giorgio Arena
Change-Id: I257a09860dd82e7bb7a767edf96dcaf31b512855 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/95865 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02 | COMPMID-583 - Implemented reference implementation and validation tests (NEON and CL) for Histogram | Isabella Gottardi
Change-Id: Iccf6b4483cb8394dab2f861a737583126f9bed81 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/91601 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02 | COMPMID-671: Add global pooling layer support. | Georgios Pinitas
Change-Id: Iead7497cc03e1e7bde440d2965a7bf54cbfa88bf Reviewed-on: http://mpd-gerrit.cambridge.arm.com/95579 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Joel Liang <joel.liang@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02 | COMPMID-661: softmax-uint8 implementation (#16) | Chunosov
Change-Id: Iad11ce70a8a0878a48e445a092035c49c926cece Reviewed-on: http://mpd-gerrit.cambridge.arm.com/94855 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-661: Add QAsymm8 for Reshape (#29) | Daniil Efremov
Change-Id: I7a4126f96aa7ef7ed768ebe5b4e2b1f84228f8e6 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/95060 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02 | COMPMID-661: Add avgpool-uint8 support. Optimize avgpool-fp32 for Bifrost. (#13) | Anton Lokhmotov
Change-Id: I32ba6afbac6694ffa053dd16f03a1b3d14627a19 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/94857 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-617: Add validation functions. | Georgios Pinitas
Added validation routines to the following kernels. -CLActivationLayer -CLBatchNormalizationLayer -CLArithmeticAddition -CLArithmeticSubtraction -CLPixelwiseMultiplication Change-Id: I0f3a03154f9e392279f715af656683cd0ad4cef5 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/94595 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-661: softmax-fp32 optimisation (#14) | Chunosov
Change-Id: I2007af1ed9dcf68065cf412aa50f73a2025b31a6 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/94605 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02 | COMPMID-661: directconv-uint8 (#20) | Chunosov
Change-Id: I84f7a1ce3658be0d3c91e65096467258af48f0b6 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/94341 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-556: Support beta for all softmax data types. | Georgios Pinitas
Change-Id: I4c0ca033dc53829fb7ac3dd7c7469d143be74e73 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/94251 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-617: Adds validation to CLPoolingLayer | Georgios Pinitas
Change-Id: Ied405a9c0e9746598d03ac6a944ad87e9b6494eb Reviewed-on: http://mpd-gerrit.cambridge.arm.com/93680 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-556: Rework CLActivationLayer | Georgios Pinitas
Refactoring. Change-Id: I879353299b655ec3026cccdfcfca2ee98abf14ea Reviewed-on: http://mpd-gerrit.cambridge.arm.com/94191 Reviewed-by: Michel Iwaniec <michel.iwaniec@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-647: Exclude padding pixels from averaging factor. | Georgios Pinitas
Adds support for excluding the padding pixels from the average scaling factor calculation. Change-Id: Ia13fbfeae235aff564db74191613921848231a01 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/93715 Reviewed-by: Robert Hughes <robert.hughes@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
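The effect of excluding padding from the averaging factor can be illustrated with a minimal 1D sketch. This is not the CLPoolingLayer kernel; the function name `avg_pool_1d` and its parameters are hypothetical, and the sketch assumes zero padding with `pad < pool_size` so every window covers at least one real element.

```python
def avg_pool_1d(values, pool_size, stride, pad, exclude_padding):
    """Average-pool a 1D list that is zero-padded by `pad` on both sides."""
    n = len(values)
    out = []
    start = -pad  # window start in unpadded coordinates
    while start + pool_size <= n + pad:
        # Clip the window to the real (unpadded) data.
        lo, hi = max(start, 0), min(start + pool_size, n)
        total = sum(values[lo:hi])  # padded positions contribute zero
        # Divide by the number of real elements, or by the full window size.
        divisor = (hi - lo) if exclude_padding else pool_size
        out.append(total / divisor)
        start += stride
    return out
```

With `values=[2, 4, 6]`, `pool_size=2`, `stride=1`, `pad=1`, the border outputs differ: the full-window divisor halves the border values, while the padding-excluding divisor leaves them equal to the single real element in the window.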
2018-11-02 | IVGCVSW-619: Support for Cl u8 bounded Relu | Michel Iwaniec
Change-Id: I3c39ecbd36f06d5376c35ed4eb38dd73533ef97e Reviewed-on: http://mpd-gerrit.cambridge.arm.com/93686 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-556: Autoconfigure support in CLDepthwiseConvolution. | Georgios Pinitas
Change-Id: I697d3237b39d0f088b820c14b65cfcbbd2e26e09 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/93412 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02 | IVGCVSW-657 : fix asymmetric padding for 3x3 depthwise conv | Jaroslaw Rzepecki
Change-Id: Ied6b3c41d988b9ff6a93f938117dc29ad4c85e9f Reviewed-on: http://mpd-gerrit.cambridge.arm.com/93421 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02 | COMPMID-643: Add bias to CLDepthwiseConvolution. | Georgios Pinitas
Change-Id: Ibfe7b8c1172d10cbcae7971fe86b82090519d31d Reviewed-on: http://mpd-gerrit.cambridge.arm.com/92798 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Jaroslaw Rzepecki <jaroslaw.rzepecki@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-556: Fix CLPoolingLayer checks | Georgios Pinitas
Change-Id: Ib76554adf00fb3c1943da634dc948089843f0e78 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/92439 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | IVGCVSW-601: support for asymmetric padding in cl conv and depthwise conv | Jaroslaw Rzepecki
Change-Id: I5c6c95091ae77dba96459c0640f9f6167a988c8c Reviewed-on: http://mpd-gerrit.cambridge.arm.com/91700 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02 | IVGCVSW-632 CL support for Softmax beta parameter | Pablo Palmier
Change-Id: I21da48d2f40aa900301235eaced54b7eb644b0b2 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/91307 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-569 - Implemented reference implementation and validation tests (NEON and CL) for Median3x3 | Isabella Gottardi
Change-Id: I7028f0bcc4a502261210f536ed604a7651ab6726 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/91332 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02 | COMPMID-618: Fix mismatches in CLDepthConvert Oclgrind | Georgios Pinitas
Out-of-bounds float-to-integer conversion is implementation-defined. Oclgrind converts to S32 and truncates, while the GPU converts to S32 and clamps. We force float-to-int conversion to always SATURATE. Change-Id: I82be9e8cdcc49b32adb8c0da064542b63f891666 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/91512 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
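The difference between the two behaviours described in this commit can be emulated with a small sketch. It is not the CLDepthConvert kernel code; `convert_sat` and `convert_wrap` are hypothetical names, and `convert_wrap` shows only one possible non-saturating outcome (keeping the low 32 bits), assumed here for illustration. Inputs are assumed to be finite.

```python
import math

INT32_MIN = -2**31
INT32_MAX = 2**31 - 1

def convert_sat(value: float) -> int:
    """Float to signed 32-bit with saturation: out-of-range inputs clamp."""
    if math.isnan(value):
        return 0  # assumption: NaN maps to 0 in a saturating conversion
    if value >= INT32_MAX:
        return INT32_MAX
    if value <= INT32_MIN:
        return INT32_MIN
    return int(value)  # in range: truncate toward zero

def convert_wrap(value: float) -> int:
    """Hypothetical non-saturating result: truncate, then keep the low 32 bits."""
    bits = int(value) & 0xFFFFFFFF
    return bits - 2**32 if bits >= 2**31 else bits

# In-range values agree; out-of-range values do not.
print(convert_sat(12.7), convert_wrap(12.7))
print(convert_sat(3e9), convert_wrap(3e9))
```

Forcing saturation removes the dependence on whichever out-of-range behaviour a given implementation happens to pick, which is why a simulator and a GPU can then be compared against the same reference.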
2018-11-02 | COMPMID-554 Add Nodes | Michalis Spyrou
- BatchNormalization - L2Normalize - Floor Change-Id: I03e06dea30e956f56a86f9c5642cd609c6696ad2 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/91364 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02 | COMPMID-417 - Added validation for FP16 CLBatchNormalizationLayer | Gian Marco Iodice
Change-Id: Icc6194a311af0e96978e6be2cc4c5da9d7fb0bcc Reviewed-on: http://mpd-gerrit.cambridge.arm.com/89493 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-by: Steven Niu <steven.niu@arm.com>
2018-11-02 | COMPMID-417: Fix border and window in CLGEMMMatrixVectorMultiplyKernel | Georgios Pinitas
Change-Id: I2eacba2c87bce84b7f6b69a734ff775473f990bc Reviewed-on: http://mpd-gerrit.cambridge.arm.com/89401 Reviewed-by: Steven Niu <steven.niu@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02 | COMPMID-424 Implemented reference implementation and tests for WarpAffine | Isabella Gottardi
Change-Id: I4924ab1de17adc3b880a5cc22f2497abbc8e221b Reviewed-on: http://mpd-gerrit.cambridge.arm.com/85820 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Steven Niu <steven.niu@arm.com>
2018-11-02 | COMPIMID-523: Fix CLDepthwiseConvolution test. | Georgios Pinitas
The specified output size of the failing test case was invalid. Additionally the kernel has been cleaned up and asserts have been added in case of invalid configurations. Change-Id: I198f3574f003b71968e4081a54cf102d748af5c1 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/88821 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Steven Niu <steven.niu@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-524 - Implemented CLTuner object | Gian Marco
Change-Id: Idbdbecca1fc299ed042936119d90e2bed8db0938 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/87101 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02 | COMPMID-541: Fix padding in CLMinMaxLocationKernel | Moritz Pflanzer
Change-Id: Ie17e3f14c428553d433da2a564e016bfac7749a9 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/88881 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2018-11-02 | COMPMID-417 fix the depthwise conv bug | steniu01
Change-Id: Ica3c26d09f8009240467e0d3a12f585170fbcd44 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/88677 Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02 | COMPMID-515: L2 Pooling for FP32/FP16 in CL. | Georgios Pinitas
Change-Id: I43641fa672f5905ca62edd1f63fc93e0cf7ea382 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/85963 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02 | COMPMID-516 Increase tolerance rate of Scale, Conv, fully connected and GEMM | steniu01
This patch also fixes a scale kernel issue: the scale factor was calculated on the GPU and is now calculated on the CPU, because the GPU and CPU gave different results for a simple float division. Change-Id: Ib6709cb6c41dcf4fc0fa4eb79e481430695bf40e Reviewed-on: http://mpd-gerrit.cambridge.arm.com/87266 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com>
2018-11-02 | COMPMID-452 CL Generic Depthwise Convolution implementation. | Giorgio Arena
Change-Id: I115e48fe6ce5e281f3791aa5d80fdc754cdd2b5e Reviewed-on: http://mpd-gerrit.cambridge.arm.com/85082 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02 | COMPMID-522 - Added support for GlobalPooling in CLPoolingLayer and CLFlattening for 3D tensor | Gian Marco Iodice
Change-Id: Ifc7db1e4d4af322a4dcbfeb3e132e5c326596872 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/86618 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02 | COMPMID-417: Add support for floats in scale. | Georgios Pinitas
Change-Id: I7d714ba13861509080a89817f54e9d32da83e970 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/86026 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02 | COMPMID-462: Implement TensorReshape for NEON and CL. | Georgios Pinitas
Change-Id: I11b39c2ceca26ade73822e29a384ef866ae05729 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/87707 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02 | COMPMID-417 Fix reduction kernel's __local buffer size | Michalis Spyrou
Change-Id: If97a79d86b174b1d9b41360303d624e3b2d22001 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/87703 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02 | COMPMID-448: Implement CL Quantization/Dequantization Layer. | Michele Di Giorgio
Change-Id: Id002e23a2ac48af3d245416dc6411d9a04a1e513 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/81827 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02 | COMPMID-477 - Optimized CLDirectConvolution1x1 for Bifrost | Gian Marco Iodice
- Fixed bug in CLDirectConvolution3x3 Change-Id: Iaf34ef44f0b7bc02e66f3eb4452ff7a90ef83523 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/86725 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com>
2018-11-02 | COMPMID-514 (3RDPARTY_UPDATE)(DATA_UPDATE) Add support to load .npy data | SiCong Li
* Add tensorflow_data_extractor script. * Incorporate 3rdparty npy reader libnpy. * Port AlexNet system test to validation_new. * Port LeNet5 system test to validation_new. * Update 3rdparty/ and data/ submodules. Change-Id: I156d060fe9185cd8db810b34bf524cbf5cb34f61 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/84914 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02 | COMPMID-415 - Fixed bug in CLDepthConcatenateKernel | Gian Marco Iodice
Change-Id: Ieedb714cb3666504c175aa488505e0485778c589 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/86705 Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02 | COMPMID-424 Implemented reference implementation, new output valid region and validation tests (NEON and CL) for Scale | Isabella Gottardi
Change-Id: I056fa3588b807a97cacf0b8afaec56e37ffc92af Reviewed-on: http://mpd-gerrit.cambridge.arm.com/83872 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-476 L2 Normalization for CL | Michalis Spyrou
Change-Id: I88f87173645880eb823916c5d4ac884c372a4fb4 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/83269 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02 | COMPMID-417: Fix invalid read in CL GEMM accumulate biases | Moritz Pflanzer
Change-Id: Ie7786a29faa0d98d8ad65c2333d0d6a1665340bc Reviewed-on: http://mpd-gerrit.cambridge.arm.com/85635 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02 | COMPMID-358 Implement OpenCL ROI Pooling | SiCong Li
* Implement OpenCL ROI Pooling * Add CLROIPoolingLayer benchmarks Change-Id: I8786d01d551850a1b4d599a48fabe3925e0a27d0 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/79833 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02 | COMPMID-477 - Optimized batched case in CLConvolutionLayer | Gian Marco Iodice
Change-Id: I4ef18f49f1da0cb816aaa0762466b940792c15ed Reviewed-on: http://mpd-gerrit.cambridge.arm.com/84162 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>