ComputeLibrary.git

Age	Commit message	Author
2018-12-19	COMPMID-1710: Improve test coverage for CLGEMMMatrixMultiplyReshapedKernel	Gian Marco Iodice
	Added tests for: 1) FP16 2) GEMM3D Change-Id: I17c03fe04fe49fba71685d33a6fd8572c91e1a56 Reviewed-on: https://review.mlplatform.org/416 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2018-12-18	COMPMID-1722 : CL: Implement Range	Vidhya Sudhan Loganathan
	Change-Id: I88da6eb5289c303b1dc91606c1560ce629746058 Reviewed-on: https://review.mlplatform.org/381 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-12-17	COMPMID-1812: CLSpaceToBatch paddings not calculated correctly	Isabella Gottardi
	Change-Id: I63fed6799c4ed2848ff80cd7458124692a52bb98 Reviewed-on: https://review.mlplatform.org/400 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2018-12-14	COMPMID-1687: Optimize CLGEMMMatrixMultiplyKernel for Mali-G76 - Part1	Gian Marco Iodice
	The current implementation is limited to FP32. Change-Id: I185ab57e483e879d7c301e9cc3033efc8b41e244 Reviewed-on: https://review.mlplatform.org/389 Reviewed-by: Anthony Barbier <Anthony.barbier@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2018-12-13	COMPMID-1071: (3RDPARTY_UPDATE) Add depth multiplier on DepthwiseConv 3x3 NHWC	Georgios Pinitas
	Change-Id: I316ff40dda379d4b84fac5d63f0c56efbacbc2b4 Reviewed-on: https://review.mlplatform.org/371 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2018-12-12	COMPMID-1697: NEPermute extended support for more cases.	Pablo Tello
	Regardless of the input data layout, NEPermute now supports all permutations of 4D tensors. Added corresponding validation tests. Change-Id: I0f8f20c2c3716e908a18a59783be53efab80ef5b Reviewed-on: https://review.mlplatform.org/367 Reviewed-by: Giuseppe Rossini <giuseppe.rossini@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2018-12-11	COMPMID-1775: Implement CLGEMMReshapeRHSMatrixKernel to reshape the RHS matrix of GEMM/GEMMLowp	Gian Marco Iodice
	Change-Id: I77f2bfcc5d170bcc2428a2f27104942c1ec877d7 Reviewed-on: https://review.mlplatform.org/375 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2018-12-10	COMPMID-1774: Implement CLGEMMReshapeLHSMatrixKernel to reshape the LHS matrix of GEMM/GEMMLowp	Gian Marco Iodice
	Change-Id: I8c5fd4c8bcdffda1522c83158981ed92baa045f4 Reviewed-on: https://review.mlplatform.org/364 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2018-12-05	COMPMID-1822: (Nightly) CL/ArithmeticDivision mismatches	giuros01
	Change-Id: I14cea30ffa9ca735941b559bb272b8c476814a34 Reviewed-on: https://review.mlplatform.org/338 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <Anthony.barbier@arm.com> Reviewed-by: Isabella Gottardi <isabella.gottardi@arm.com>
2018-12-05	COMPMID-1719 CL: Implement RSqrt, Exp	Michalis Spyrou
	Change-Id: I827b26239043a9e90d26c2583122648d2a45303a Reviewed-on: https://review.mlplatform.org/317 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2018-12-05	COMPMID-1723: CL: Implement Reverse	Michele Di Giorgio
	Change-Id: Id0d4a07af24e2331161996083b0c1bab072bd405 Reviewed-on: https://review.mlplatform.org/322 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2018-12-05	COMPMID-1298: Fuse ReLu activation in CLWinogradOutputTransform	Manuel Bottini
	Change-Id: I9e6e43a5839d04c2e4b4552c05446efb0a5074cf Reviewed-on: https://review.mlplatform.org/232 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-12-05	COMPMID-1073: CLDepthwiseConvolutionLayer uses the optimised path	Pablo Tello
	Change-Id: Ibdb7d875f8ff89bc210c63d389abef1ea1fd51d5 Reviewed-on: https://review.mlplatform.org/330 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Anthony Barbier <Anthony.barbier@arm.com>
2018-12-05	COMPMID-1725: Implement Pack	Gian Marco Iodice
	Change-Id: I13f6e4c600f39355f69e015409bf30dafdc5e3aa Reviewed-on: https://review.mlplatform.org/332 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2018-12-04	COMPMID-1820: (Nightly) NEON/DepthConvertLayer/F16_to_F32 fails	Georgios Pinitas
	-Removes shift from depth conversion tests. -Changes Cast tolerance between float conversions to zero Change-Id: I6c456f7d910eb3c02069f1e4d5df7b257d6d784e Reviewed-on: https://review.mlplatform.org/341 Reviewed-by: Anthony Barbier <Anthony.barbier@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2018-11-30	COMPMID-1717: CL: Implement Maximum, Minimum, SquaredDifference	giuros01
	Change-Id: Ice653e48211053bd3cd20a693bd76de6b4efc370 Reviewed-on: https://review.mlplatform.org/270 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2018-11-30	COMPMID-1728 CL: Implement ArgMax/ArgMin	Michalis Spyrou
	Change-Id: I7eae2e55cc0b0b7bbebb7617299daaca6f75f40c Reviewed-on: https://review.mlplatform.org/292 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-28	COMPMID-1716: CL Comparison operations	Georgios Pinitas
	Adds support for Equal,NotEqual,Less,LessEqual,Greater,GreaterEqual Change-Id: If0cdf4aae7f95c94709b195eee485f6663f45909
2018-11-27	COMPMID-1720: CL: Implement Tile	giuros01
	Change-Id: I2a18f0acea382960a8bc71a8f56928a5998f0dd6
2018-11-23	COMPMID-1734: Implement CLSelect	Georgios Pinitas
	Change-Id: I49b2e8b4200c9ed654736d9451e4ab9c073b4b10
2018-11-22	COMPMID-1645 NEL2Normalization for FP32/FP16 & NHWC	Michalis Spyrou
	Change-Id: I29e35024e29781a6b943b568abec9c73649215e6
2018-11-22	COMPMID-1718: Extend DepthConvert to support Cast	Georgios Pinitas
	Change-Id: I6ee2c0b670727fc808fa636c53ddfaec3a0036c9
2018-11-22	COMPMID-1648: CLNormalizationLayer IN_MAP_2D support for NHWC for FP32/FP16	Michele Di Giorgio
	Change-Id: I49f1d865f5e7562f1d80db849353a89ef77e6a9e
2018-11-21	COMPMID-1451 (Nightly) L2Normalization sigkill	Michalis Spyrou
	NHWC reduction on 0 axis requires a lot of memory. Testing only axis 1 and 2 for now. Change-Id: I82e16a27b6dfc6b426e6294cde63c3d88cb41a09
2018-11-21	COMPMID-1088: Use IMemoryRegion in interfaces where possible	Georgios Pinitas
	-Simplifies import memory interface -Replaces the use of void** handles with appropriate interfaces. Change-Id: I5918c855c11f46352058864623336b352162a4b7
2018-11-20	COMPMID-1451: Fix CLBatchToSpace static validation method	Michalis Spyrou
	Change-Id: I770b044b67d93510ef65e556905135b34be7ea0a
2018-11-19	COMPMID-1065 : Create documentation explaining how to add new functions / kernels	Vidhya Sudhan Loganathan
	Change-Id: I98183f95814442b6f3dbb67a1bdae99df05b9b01
2018-11-16	COMPMID-1451: Fixes for BoundingBoxTransform	giuros01
	- Fixing a bug for which we did not scale the boxes before transforming them - Adding the correct_transform_coords option to BoundingBoxTransformInfo Change-Id: I40281254bcf87e7c8583c119e99562414fe59822
2018-11-16	COMPMID-1451: (3RDPARTY_UPDATE) Fixes for GenerateProposals graph node and BoxWithNMSLimitKernel	Michele Di Giorgio
	COMPMID-1792: Accuracy issue in CLGenerateProposals. This patch does the following: - Some fixes for GenerateProposals function and tests - Adapting BoxWithNMSLimitKernel to only accept U32 tensors as keeps_size - Update 3rdparty - Adds a small tolerance for a GenerateProposals test Change-Id: Ia8ec1cdfe941fe05003645e86deb9ea6a6044d74
2018-11-16	COMPMID-1266 : Add support for FP16 in CLWinogradConvolutionLayer: 5x5 kernels	Vidhya Sudhan Loganathan
	Introduced F32 accumulation for F16 Winograd GEMM and output transform. WinogradConvolution will be available for F16 only if the fast math flag is enabled. Change-Id: I215593c205236a0f9669218437bb40b184ec6a4f
2018-11-16	COMPMID-1793 CLL2Normalization mismatches	Michalis Spyrou
	Increase tolerance for FP16 Change-Id: I88f95da5471bbceb7449f453e2e33cf0bc4da23e
2018-11-15	COMPMID-1676: Change CLROIAlign interface to accept ROIs as tensors	Manuel Bottini
	Change-Id: I69e995973597ba3927d29e4f6ed5438560e53d77
2018-11-15	COMPMID-1708: Improve GEMM test coverage.	Pablo Tello
	Added test cases to exercise the code path where the reshaping of B is performed on the fly. Change-Id: Ifa4348e1054dc0019be3927f482adf64b18fd554
2018-11-15	COMPMID-1329: Add support for GenerateProposals operator in CL	giuros01
	Change-Id: Ib0798cc17496b7817f5b5769b25d98913a33a69d
2018-11-14	COMPMID-1462 SSD support: Create CL PriorBox	Michalis Spyrou
	Change-Id: I5bf5d751ec7c02d96c26a769f49d03ea23a248b7
2018-11-14	COMPMID-1781 Add channel support in CLL2Normalization	Michalis Spyrou
	Change-Id: Ibab049f09413258c99335b7da6b151530a1bd136
2018-11-08	COMPMID-1736: Fixed out-of-bound write in CLIm2Col	Gian Marco Iodice
	The issue was related to CLIm2Col when the number of input channels was less than the number of elements processed by each thread. The bug has been fixed in the validate_and_configure_window() function by setting the correct number of elements accessed in the output tensor. Also fixed an issue in GEMM3D when there is a single output channel. Change-Id: I094292d0c7662599c4a4c3916ec5f5821df5faef
2018-11-08	COMPMID-1776: Revert QuantizeDownStage to use fixed-point	Georgios Pinitas
	Change-Id: I807ef84dbf893bd401dcac5c0fa3a4ee49aabc66
2018-11-06	COMPMID-1451: Fix order of allocations in CLLSTMLayer	Michele Di Giorgio
	ArmNN reported an issue with padding in CLLSTMLayer. This was due to the fact that some tensors were allocated before they were passed to some configure functions which attempted to change the padding requirement on already allocated memory. Also, increase tolerance on number of mismatches for CLBBoxTransform FP16. Change-Id: Iad75b012be895693d0e553f3ab85f1ca7144e882
2018-11-02	COMPMID-1451 Reduce precommit tests	Michalis Spyrou
	Reduce the amount of precommit tests run in DirectConvolution, Deconvolution and Pooling. Proper investigation scheduled for later. Change-Id: Idc2510cf6877e7a605cead84f384852b609e3216 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/156466 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Vidhya Sudhan Loganathan <vidhyasudhan.loganathan@arm.com>
2018-11-02	COMPMID-1712 CLPoolingLayer wrong results in QASYMM8	Michalis Spyrou
	Also added the test case reported by ArmNN. Change-Id: I9fe9a1b4f74267a3346529f3a597b37486593c4a Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155914 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02	COMPMID-1413 - Improve the performance of GEMMLowp with 8 bit dot product on OpenCL	Gian Marco Iodice
	COMPMID-1424 - Add dot product support for CLDepthwise QASYMM8 3x3 NHWC non-unit stride. With this patch we are able to improve the performance of MobileNet v1-qasymm8 by 37%. Tried to use the dot product instruction in CLDepthwise QASYMM8 3x3 NHWC non-unit stride but I have not seen any benefit (maybe because we have few arithmetic operations and we do not have more load instructions). However, Depthwise convolution has been improved by 30%. Change-Id: Id768a99c2e53a04276707e427af5d0ec93419ada Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155082 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02	COMPMID-1696: (Nightly) CLDepthwiseConvolution FP16 mismatches	Georgios Pinitas
	Increases relative tolerance slightly as error was quite small. Change-Id: I4789c5e3eeb4f2d3aaf2b4c76966474f045af4c1 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155418 Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Tested-by: bsgcomp <bsgcomp@arm.com>
2018-11-02	COMPMID-1680: (Nightly) CLBBoxTransform mismatches	giuros01
	Instead of changing the tolerances I increased the sizes of the input. This way a single mismatch, as was the case here, stays below the 1% tolerance set. Change-Id: I787261a1d1adb559c1687b7bd1e0317a72594130 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155168 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: bsgcomp <bsgcomp@arm.com>
2018-11-02	COMPMID-1451 Properly remove dimensions in CLReduceMean	Michalis Spyrou
	Change-Id: I7bd4a8ce81483ba56686b765ca3caabebe42882d Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155000 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
2018-11-02	COMPMID-1673: Collapse window in CLArithmeticAddition when one operand is a vector	Michele Di Giorgio
	When one of the operands is a vector, the kernel does a broadcast addition and the window is not collapsed. This represents an issue because it leads to a lot of enqueues that increase the time taken by the OpenCL driver. This patch allows the window to be collapsed when one of the two operands is a vector. Furthermore, it adds LWS tuner to the kernel. It also changes the number of elements processed per iteration to 8 to make better use of the cache. Change-Id: I5f09ab0ddcffb3b7f9326a987c79a997b2d7fa8c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155003 Reviewed-by: Giuseppe Rossini <giuseppe.rossini@arm.com> Tested-by: bsgcomp <bsgcomp@arm.com>
2018-11-02	COMPMID-1451: Perform CLOutputStage using floats.	Georgios Pinitas
	Change-Id: Ic8312a5b6790aa7cd4468d42f08d557ad40e9441 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/154570 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02	COMPMID-1327: Add support for BBoxTransform operator in CL	giuros01
	Change-Id: I91865506166951b3bf7f06a0b2d4cde925cfefb6 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/153447 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02	COMPMID-1632 Add CLL2NormalizationLayer for NHWC and FP32	Michalis Spyrou
	Change-Id: Iae22554d5fe893fd22a000eab5bfd8275ea06eb3 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/154102 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: bsgcomp <bsgcomp@arm.com>
2018-11-02	COMPMID-1523: Fuse BN node with convolution.	Georgios Pinitas
	Change-Id: I146936c9e98b343496a4b61cdbadf0eaa38e885a Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/154008 Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Reviewed-by: Giuseppe Rossini <giuseppe.rossini@arm.com> Tested-by: bsgcomp <bsgcomp@arm.com>