ComputeLibrary.git

Age	Commit message	Author
2019-01-18	COMPMID-1687: Optimize CLGEMMMatrixMultiplyKernel	Gian Marco Iodice
	Change-Id: I040478ff7aa04f0523ed6e302129b829442cb194 Reviewed-on: https://review.mlplatform.org/534 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2019-01-15	COMPMID-1724: CL Implement Prod fix	Manuel Bottini
	Change-Id: I9cf07afe6198e3364ede06faaa9a09a782a34792 Reviewed-on: https://review.mlplatform.org/519 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2019-01-14	COMPMID-1772: Implement PadV2 for NEON	Georgios Pinitas
	Change-Id: Ia4604524a034c46b004fd850183480c5fbfd8cb3 Reviewed-on: https://review.mlplatform.org/437 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2019-01-14	COMPMID-1724: CL Implement Prod	Manuel Bottini
	Change-Id: I17e51f25064b53a8f7e13d6fcbecc14a192de103 Reviewed-on: https://review.mlplatform.org/387 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2019-01-14	Issue COMPMID-1835: Remove CLGEMMInterleave4x4Kernel and replace with CLGEMMReshapeLHSMatrixKernel	giuros01
	Change-Id: Id6a1bd78f9b1698b64a004e4adebc41002b15745 Reviewed-on: https://review.mlplatform.org/496 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2019-01-11	COMPMID-1677: Change ROIPooling layer interface to accept ROIs as tensors	Manuel Bottini
	Change-Id: If16b572a4d906187b77f32133a72a44316fa74e4 Reviewed-on: https://review.mlplatform.org/490 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2019-01-11	COMPMID-1761: NEON: Implement Pack	Isabella Gottardi
	Change-Id: Icc3392494b1e3361e8fd925da200827c494351b3 Reviewed-on: https://review.mlplatform.org/430 Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giuseppe Rossini <giuseppe.rossini@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2019-01-08	COMPMID-1865 NEReduceMean fails on shape validation	Michalis Spyrou
	Also handle negative axis Change-Id: I28e48702d926c2f4aea7b1b674b51bebb01ce5f8 Reviewed-on: https://review.mlplatform.org/464 Reviewed-by: Matthew Bentham <matthew.bentham@arm.com> Reviewed-by: Isabella Gottardi <isabella.gottardi@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2019-01-07	COMPMID-1727 - CL: Implement Gather	Manuel Bottini
	Change-Id: I3d859da09a4de1019bb8c2046725eab942247927 Reviewed-on: https://review.mlplatform.org/386 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-12-28	COMPMID-1860: Invalid arguments in CLDepthwiseConvolution3x3 for NHWC	Georgios Pinitas
	-Alters the kernel/function selection process to use validate for selection. -Fixes border kernel input in case of permutation. Change-Id: Ia61df3a0ed661349114dc125f33ad53ee40d9c76 Reviewed-on: https://review.mlplatform.org/443 Reviewed-by: Anthony Barbier <Anthony.barbier@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2018-12-27	COMPMID-1710: Fixed unused function warning in CLUnstack	Georgios Pinitas
	Change-Id: I94ef19271b059fafb7dad26fee0e229d7e65f64e Reviewed-on: https://review.mlplatform.org/441 Reviewed-by: Pablo Marquez <pablo.tello@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <Anthony.barbier@arm.com>
2018-12-21	COMPMID-1726: Implement CLUnstack.	Pablo Tello
	Change-Id: I94b0707d19757c5f5d7ca66d9c47e378867126a3 Reviewed-on: https://review.mlplatform.org/325 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-12-21	COMPMID-1836: Remove CLGEMMTranspose1xWKernel and replace with CLGEMMReshapeRHSMatrixKernel	giuros01
	Change-Id: Ic5a4f32657a155380684dcd4b44fbb608ef40cb4 Reviewed-on: https://review.mlplatform.org/418 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2018-12-18	COMPMID-1722 : CL: Implement Range	Vidhya Sudhan Loganathan
	Change-Id: I88da6eb5289c303b1dc91606c1560ce629746058 Reviewed-on: https://review.mlplatform.org/381 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-12-17	COMPMID-1812: CLSpaceToBatch paddings not calculated correctly	Isabella Gottardi
	Change-Id: I63fed6799c4ed2848ff80cd7458124692a52bb98 Reviewed-on: https://review.mlplatform.org/400 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2018-12-14	COMPMID-1710: Fixes in StrideSlice calculations.	Georgios Pinitas
	Change-Id: I66eb922f1ff15142de278bf4439a61c979f98ba7 Reviewed-on: https://review.mlplatform.org/382 Reviewed-by: Matthew Bentham <matthew.bentham@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Marquez <pablo.tello@arm.com>
2018-12-14	COMPMID-1687: Optimize CLGEMMMatrixMultiplyKernel for Mali-G76 - Part1	Gian Marco Iodice
	The current implementation is limited just to FP32 Change-Id: I185ab57e483e879d7c301e9cc3033efc8b41e244 Reviewed-on: https://review.mlplatform.org/389 Reviewed-by: Anthony Barbier <Anthony.barbier@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2018-12-13	COMPMID-1071: (3RDPARTY_UPDATE) Add depth multiplier on DepthwiseConv 3x3 NHWC	Georgios Pinitas
	Change-Id: I316ff40dda379d4b84fac5d63f0c56efbacbc2b4 Reviewed-on: https://review.mlplatform.org/371 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2018-12-05	COMPMID-1719 CL: Implement RSqrt, Exp	Michalis Spyrou
	Change-Id: I827b26239043a9e90d26c2583122648d2a45303a Reviewed-on: https://review.mlplatform.org/317 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2018-12-05	COMPMID-1723: CL: Implement Reverse	Michele Di Giorgio
	Change-Id: Id0d4a07af24e2331161996083b0c1bab072bd405 Reviewed-on: https://review.mlplatform.org/322 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2018-12-05	COMPMID-1298: Fuse ReLu activation in CLWinogradOutputTransform	Manuel Bottini
	Change-Id: I9e6e43a5839d04c2e4b4552c05446efb0a5074cf Reviewed-on: https://review.mlplatform.org/232 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-12-05	COMPMID-1073: CLDepthwiseConvolutionLayer uses the optimised path	Pablo Tello
	Change-Id: Ibdb7d875f8ff89bc210c63d389abef1ea1fd51d5 Reviewed-on: https://review.mlplatform.org/330 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Anthony Barbier <Anthony.barbier@arm.com>
2018-12-05	COMPMID-1725: Implement Pack	Gian Marco Iodice
	Change-Id: I13f6e4c600f39355f69e015409bf30dafdc5e3aa Reviewed-on: https://review.mlplatform.org/332 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2018-11-30	COMPMID-1717: CL: Implement Maximum, Minimum, SquaredDifference	giuros01
	Change-Id: Ice653e48211053bd3cd20a693bd76de6b4efc370 Reviewed-on: https://review.mlplatform.org/270 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2018-11-30	COMPMID-1728 CL: Implement ArgMax/ArgMin	Michalis Spyrou
	Change-Id: I7eae2e55cc0b0b7bbebb7617299daaca6f75f40c Reviewed-on: https://review.mlplatform.org/292 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-28	COMPMID-1716: CL Comparison operations	Georgios Pinitas
	Adds support for Equal,NotEqual,Less,LessEqual,Greater,GreaterEqual Change-Id: If0cdf4aae7f95c94709b195eee485f6663f45909
2018-11-27	COMPMID-1720: CL: Implement Tile	giuros01
	Change-Id: I2a18f0acea382960a8bc71a8f56928a5998f0dd6
2018-11-23	COMPMID-1734: Implement CLSelect	Georgios Pinitas
	Change-Id: I49b2e8b4200c9ed654736d9451e4ab9c073b4b10
2018-11-22	COMPMID-1645 NEL2Normalization for FP32/FP16 & NHWC	Michalis Spyrou
	Change-Id: I29e35024e29781a6b943b568abec9c73649215e6
2018-11-22	COMPMID-1718: Extend DepthConvert to support Cast	Georgios Pinitas
	Change-Id: I6ee2c0b670727fc808fa636c53ddfaec3a0036c9
2018-11-21	COMPMID-1088: Use IMemoryRegion in interfaces where possible	Georgios Pinitas
	-Simplifies import memory interface -Changes the use of void** handles to appropriate interfaces. Change-Id: I5918c855c11f46352058864623336b352162a4b7
2018-11-19	COMPMID-1065 : Create documentation explaining how to add new functions / kernels	Vidhya Sudhan Loganathan
	Change-Id: I98183f95814442b6f3dbb67a1bdae99df05b9b01
2018-11-16	COMPMID-1451: (3RDPARTY_UPDATE) Fixes for GenerateProposals graph node and BoxWithNMSLimitKernel	Michele Di Giorgio
	COMPMID-1792: Accuracy issue in CLGenerateProposals This patch does the following: - Some fixes for GenerateProposals function and tests - Adapting BoxWithNMSLimitKernel to only accept U32 tensors as keeps_size - Update 3rdparty - Adds a small tolerance for a GenerateProposals test Change-Id: Ia8ec1cdfe941fe05003645e86deb9ea6a6044d74
2018-11-16	COMPMID-1266 : Add support for FP16 in CLWinogradConvolutionLayer: 5x5 kernels	Vidhya Sudhan Loganathan
	Introduced F32 accumulation for F16 winograd gemm and output transform. WinogradConvolution will be available for F16 only if fast math flag is enabled. Change-Id: I215593c205236a0f9669218437bb40b184ec6a4f
2018-11-15	COMPMID-1787: Change the heuristic selection in CLGEMMLowpMatrixMultiplyCore	Gian Marco Iodice
	Change-Id: Ia8d4e46ce5d9bb366af15726bc208dc14583c6ae
2018-11-15	COMPMID-1676: Change CLROIAlign interface to accept ROIs as tensors	Manuel Bottini
	Change-Id: I69e995973597ba3927d29e4f6ed5438560e53d77
2018-11-15	COMPMID-1451: Fix the shape of scratch_buffer in case of CIFG	Georgios Pinitas
	In case of CIFG optimisation scratch buffer should have a size of [batch_size, num_units * 3] else [batch_size, num_units * 4]. Change-Id: I43e46f7b52e791472f1196f36e9142240ba76c5c
2018-11-15	COMPMID-1329: Add support for GenerateProposals operator in CL	giuros01
	Change-Id: Ib0798cc17496b7817f5b5769b25d98913a33a69d
2018-11-14	COMPMID-1462 SSD support: Create CL PriorBox	Michalis Spyrou
	Change-Id: I5bf5d751ec7c02d96c26a769f49d03ea23a248b7
2018-11-13	COMPMID-1707: Create 3 special CLWidthConcatenate kernel to concatenate 2/4 and 8 tensors (Part 1)	Michele Di Giorgio
	Creating special cases for concatenating 2 and 4 tensors. Change-Id: I6a739a494ae45011acb65369e353f9ef96970b90
2018-11-12	COMPMID-1451: Set axis correctly in CLL2Normalize validate function	Georgios Pinitas
	Change-Id: I93b14106cda8a1f640cf5acf120d31e2ebdaf495
2018-11-08	COMPMID-1451: Fix fused activation in GEMMConvolutionLayer	Georgios Pinitas
	-Uses output quantization information for the activation layer. -Updates checks for BoundedRelu at CL side. Change-Id: I0447860e90f1c89b67b9ace3c8daad713f6c64e0
2018-11-08	COMPMID-1451: Allow weights retention in CLDeconvolutionLayer	Michele Di Giorgio
	Change-Id: I953f3b63aa4910650a1a3f6faea31beb4f6f376a
2018-11-08	COMPMID-1451: Removed output_depth3d from CLGEMMLowpQuantizeDownInt32ToUint8ScaleByFloat	Gian Marco Iodice
	Since we perform an element-wise operation, it is not necessary to pass the output_depth3d. Change-Id: Ibfa07a0706e902acf59b444aa61e18a348162ea9
2018-11-08	COMPMID-1736: Fixed out-of-bound write in CLIm2Col	Gian Marco Iodice
	The issue was related to CLIm2Col when the number of input channels was less than the number of elements processed by each thread. The bug has been fixed in the validate_and_configure_window() function by setting the correct number of elements accessed in the output tensor. Also fixed an issue in GEMM3D when we have a single output channel. Change-Id: I094292d0c7662599c4a4c3916ec5f5821df5faef
2018-11-06	COMPMID-1451: Fix order of allocations in CLLSTMLayer	Michele Di Giorgio
	ArmNN reported an issue with padding in CLLSTMLayer. This was due to the fact that some tensors were allocated before they were passed to some configure functions which attempted to change the padding requirement on already allocated memory. Also, increase tolerance on number of mismatches for CLBBoxTransform FP16. Change-Id: Iad75b012be895693d0e553f3ab85f1ca7144e882
2018-11-02	COMPMID-1413 - Improve the performance of GEMMLowp with 8 bit dot product on OpenCL	Gian Marco Iodice
	COMPMID-1424 - Add dot product support for CLDepthwise QASYMM8 3x3 NHWC non-unit stride. With this patch we are able to improve the performance of MobileNet v1-qasymm8 by 37%. Tried to use the dot product instruction in CLDepthwise QASYMM8 3x3 NHWC non-unit stride but I have not seen any benefit (maybe because we have few arithmetic operations and we do not have more load instructions). However, Depthwise convolution has been improved by 30%. Change-Id: Id768a99c2e53a04276707e427af5d0ec93419ada Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155082 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02	COMPMID-1451: Fix validation issue in CLReduceMean	Michalis Spyrou
	Change-Id: Ie1bcdd9dca2dc3b26003790a19cc80bb953385b2 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155373 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: bsgcomp <bsgcomp@arm.com>
2018-11-02	COMPMID-1451 Properly remove dimensions in CLReduceMean	Michalis Spyrou
	Change-Id: I7bd4a8ce81483ba56686b765ca3caabebe42882d Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155000 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
2018-11-02	COMPMID-1451: Perform CLOutputStage using floats.	Georgios Pinitas
	Change-Id: Ic8312a5b6790aa7cd4468d42f08d557ad40e9441 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/154570 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>