Age | Commit message (Collapse) | Author |
|
Partially resolves : COMPMID-3793
Signed-off-by: Suhail Munshi <MohammedSuhail.Munshi@arm.com>
Change-Id: Id82e00c784f0a039017fd896f11671bdda2dd4ab
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5530
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Execution window along the X axis needs to be collapsed on the 3rd axis (rather than the 2nd) since there could be implicit padding added along the Y
Resolve COMPMID-4425
Change-Id: I9623a31749b737fea7c623cabdcfbf77cbe8f6dc
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5551
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
All data type and data layout information for the operators are store in the function header files
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>
Change-Id: I30b564f7eda6bbd99bf3ad36ddb6639ac118eb8b
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/319829
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
Comments-Addressed: bsgcomp <bsgcomp@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5531
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Simplify the implementation when the pooling size has the same spatial
dimensions of the input tensor
- Rework the heuristic for F32/F16
- Add test for validating the global pooling path
- Fix compare_dimensions in validation. The validation fails because
we have different
number of dimensions for NCHW and NHWC (e.g. 1,1,2,1(NCHW) ->
2,1,1,1(NHWC)
Resolves COMPMID-4426
Change-Id: Ia53ee659a9fbc3d011f286a8150d1be9d6d2cd05
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5533
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
It doesn't take into consideration the padding.
Change-Id: Ie7a2a0c7aee93fd7007deb6e46f6bec0e7f06066
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5529
Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Change the parallel implementation across the X, now every thread computes one row
Add missing test for MEAN_SUM
Make reduction on any axis != 0 work with num_channels > 1
Resolve COMPMID-3917
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Change-Id: Ib0f99540104e3c253bcd1ea637833db533f5e76e
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5522
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I5c440f4c6ca4186adcfa926e6b7d924086671f29
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5520
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Manuel Bottini <manuel.bottini@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Correctly calculate the filter size when we
exclude padding.
Resolves: COMPMID-4408
Change-Id: Ia96d6ddbda58d23a5a0e02854ecb22089d538f94
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5523
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Sheri Zhang <sheri.zhang@arm.com>
|
|
Oclgrind flags this as a warning.
Resolves: COMPMID-4338
Change-Id: Id5a075894027d867bdd21937626f052acb05b531
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5512
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Manuel Bottini <manuel.bottini@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Simplify the implementation when the pooling size has the same spatial
dimensions of the input tensor
- Rework the heuristic for F32/F16
- Add test for validating the global pooling path
- Fix compare_dimensions in validation. The validation fails because we have different
number of dimensions for NCHW and NHWC (e.g. 1,1,2,1(NCHW) -> 2,1,1,1(NHWC)
Change-Id: Iba680cb30bf2a5d0952265a4cc9794f368549ca5
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5510
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
* Resolves COMPMID-4400
Change-Id: I54c33a017c735194fbf4437d1c7df465208bc0ca
Signed-off-by: Pablo Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5505
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Sheri Zhang <sheri.zhang@arm.com>
|
|
Fixing Conv5x5, Conv5x1, Conv1x5
Resolves: COMPMID-4380
Change-Id: I5206d9b85b1d73f6010f02c119aae91266395ba7
Signed-off-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5485
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Aleksandr Nikolaev <aleksandr.nikolaev@arm.com>
Reviewed-by: Sheri Zhang <sheri.zhang@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolve COMPMID-4396
Change-Id: I9b16791f84d60bc4a5303a6393cdbe9db3a4f0e9
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5483
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com>
|
|
Change-Id: Ibab2095a1a5b525c8513f924cd5bcecd5172ed48
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5467
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Remove the reshaped variant for CLDepthwiseConvolutionLayer 3x3 NHWC Quantized
- Remove kernel selection by GPUTarget
- Remove unused quantized support from the NHWC kernel
- Remove CLDepthwiseConvolutionLayerReshapeWeightsKernel
- Remove OpenCL kernels for reshaped dwc 3x3 quantized and weights reshape
- Remove the "_bifrost" suffix in common OpenCL kernel
- Remove the ICLDepthwiseConvolutionLayer3x3Kernel common interface
Resolve COMPMID-3864, COMPMID-3907
Change-Id: Icfac0fb6c00e214985beb05dad7c0cdbbee7d830
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5447
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Change kernel's vec_size to 16 / sizeof(output)
- Change ICLKernel.cpp to handle broadcast without padding
Resolve COMPMID-3913
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Change-Id: I03e884b250ef5784dc109bff8cf2c96b345d119f
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5450
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Computing the activation in FP32 and then converting in FP16
Resolves: COMPMID-4380
Change-Id: I8a857af65967c8017fb60a358b4f8f0d9fc2e1c2
Signed-off-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5457
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Partially Resolves : COMPMID-3793
Signed-off-by: Suhail Munshi <MohammedSuhail.Munshi@arm.com>
Change-Id: I14d6884c34f33a6caee11fc1230f9d2d3ae6c4c1
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5425
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-by: Manuel Bottini <manuel.bottini@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
* Make changes to split the workload into two kernels. One kernel precomputes
mean and variance and the second kernel just loads these precomputed values.
* The new approach runs %30 faster than the original code for NHWC workloads
like 32x192x256.
* Resolves MLCE-337
Change-Id: I8356fcefa2d131ab4dcb32268ce7142421d073e4
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5355
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
|
|
Resolve: COMPMID-3911
Signed-off-by: Sheri Zhang <sheri.zhang@arm.com>
Change-Id: Id5615b6a8b52030fb611a1a04bcd4664b8232e90
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5451
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Only for NHWC data layout
Resolves: COMPMID-3910
Change-Id: Ie2d71482b3e3b55ac155e9af152032a5de8bbd50
Signed-off-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5388
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
- The array initializer for the TILE object cannot always be utilized and so we
do require to manually initialize the TILE with the LOOP_UNROLLING macro
- Resolves COMPMID-4371
Change-Id: I2598354b9fae84c5e3bd11219fffdcdc297215e1
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5417
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolve: COMPMID-4370
Signed-off-by: Sheri Zhang <sheri.zhang@arm.com>
Change-Id: I4b2a8bf252405fe9006784fa1769ad5b6e708a71
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5414
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- The cl_image object can be used for the weights
- cl_image can only work for f32/f16
- Fix the implicit padding on the first dimension X
Resolves COMPMID-4341
Change-Id: I04e0901c69e7765c42afceca38c4a840645b9123
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5393
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
The issue is related with clang version, clang 3.9 has the problem, clange 4.0 works. The workaround is to add an extra {} to make this work.
Partial resolves: COMPMID-4348
Signed-off-by: Sheri Zhang <sheri.zhang@arm.com>
Change-Id: Ia079cbb3c44d617b1b42cb2af758b5a8ba1a032e
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5399
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Manuel Bottini <manuel.bottini@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-4342
Change-Id: I468c6d68c0284e4ec76f22037a697fff7bc5638c
Signed-off-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5391
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolve COMPMID-4140
Change-Id: I17db0ee596665598d08d4359a373160f21ab9acd
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5390
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
The issue is related with clang version, clang 3.9 has the problem, clange 4.0 works. The workaround is to add an extra {} to make this work.
Resolves: COMPMID-4348
Signed-off-by: Sheri Zhang <sheri.zhang@arm.com>
Change-Id: I2d8fc6400f32af5406fbf2d2556127a53b2ce918
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5392
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
This patch takes advantage of tile_helpers.h and different
data layout input and tmp matrices.
Resolves: COMPMID-4142
Signed-off-by: Aleksandr Nikolaev <aleksandr.nikolaev@arm.com>
Change-Id: I5d10bd3f08137414ee7520eef1e6d0aef8cbf160
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5382
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
|
|
- Rework Winograd Input Transform 3x3 NHWC using the new macros
- Rework Winograd Input Transform 5x5 NHWC using the new macros
- Rework Winograd Input Transform 7x7 NHWC using the new macros
- The new implementation is also faster than before
- Winograd Input Transform 5x5/7x7 3x faster
Resolves COMPMID-4139
Change-Id: Ia9c8af23a2d47d2db60ec4c44650a63a34ffa0d5
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5358
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
|
|
Resolves: COMPMID-3909
Change-Id: I00a1705ed202002e2a6053702272181805fa6869
Signed-off-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5360
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Also fixes RoiPoolingLayer not matching reference with Float32 datatype Issue
Tests added to check ROIPooling Layer against reference with both Float32 and Qasymm8 input.
Resolves : COMPMID-2320
Change-Id: Ib86d2e6b3803e74f922a545ea573da02c28e54cc
Signed-off-by: Suhail Munshi <MohammedSuhail.Munshi@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5332
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
|
|
Resolves: COMPMID-4299
Change-Id: Ie6a52c1371b9a2a7b5bb4f019ecd5e70a2008567
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5338
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-4141
Signed-off-by: Aleksandr Nikolaev <aleksandr.nikolaev@arm.com>
Change-Id: I1437680029ff25a3a5d4f6f258f30960545056a9
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5299
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
The biases input can be nullptr, hence we need to check before
referencing.
A test is also added to ensure a successful configure and run of Direct
Convolution when there is no bias.
Resolves: COMPMID-4315
Change-Id: I23223efd6ced81215aff490221fb4606945c139b
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5322
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: James Conroy <james.conroy@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
This patch reworks the winograd output transform 3x3 NHWC on OpenCL
- Use utility macros in tile_helpers.h to rewrite the kernel
- Implement the tile utility macro for the activation
Resolves COMPMID-4144
Change-Id: I86a9bb9ea96b9629a18642b56bb63750710e6af5
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5324
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves COMPMID-4296
Change-Id: Ib4e26fd3f9ba66f18ea8ef8b982cc88158564045
Signed-off-by: SiCongLi <sicong.li@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5277
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolve COMPMID-4143
Change-Id: I71521f50ad47c53c963303251b5da94a7abd5783
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5302
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
The new function can handle different block sizes (M0, N0)
New utility macros have been developed to simplify the work and the
future OpenCL kernel development. In particular the work has been done
to also consider cases with:
- the texture pipe support
- dynamic tensor shape support
Change-Id: Ife4c64baf07517938bb8ad18e6a5f4579345c40f
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5297
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves COMPMID-4151
Change-Id: I46f541efe8c4087f27794d2e158b6c1547d459ba
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5160
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
|
|
Resolve COMPMID-4288
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Change-Id: I340d18a13f3f00432df3a7d3fc83f93e2591b6d3
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5184
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
|
|
Full trademarks available in README.md
Resolves: COMPMID-4257
Signed-off-by: Sheri Zhang <sheri.zhang@arm.com>
Change-Id: Ibfba2adf2eef3449433f467464ebd87d7198474d
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5116
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
- Revert changes in strides > num_dimensions. Set them to 0
- Fix offset calculcation in depthwise 3x3 quantized using select and stride_y for max offset
Resolve COMPMID-4254
Change-Id: Ia99b9637f18b99b1fa3d4b7b4892046027d3e7e5
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5040
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Fix errors when computing tensors with one element only
- Replace Tensor3D with raw pointers so to get rid of offset to first element for NCHW layout
- Add stronger out of bound constraints for NHWC layout
- Set the border size to the input's padding for NHWC
- Fill the strides == 0 with the largest stride, so to avoid accessing empty strides and multiplying by 0
Resolve COMPMID-4088
Change-Id: I751a4e6d7094b3c42306ff7f53af848fd35f19ac
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5024
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Change select condition's data type to satisfy its signature
- Add failing test case with VEC_SIZE == 1
Resolve: COMPMID-4110
Change-Id: I52287bff7a2108f92fd12164e267df6c074d5508
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4978
Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
- The ARM DOT macro was using wrong variables for performing the dot
product
- K0 could be a non power of 2 values when IFM was not a multiple of 16
- Refactor the test for direct convolution NHWC
Resolves COMPMID-4135, COMPMID-4155
Change-Id: I3a2dc89ef613ae20245cfc28e76ea36c55eaf81d
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4962
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
- Fix ambiguosity with select in OpenCL
- Define a new macro for signed integer data type of the same input data type's size. This is needed because some ops (e.g. logical operators) in OpenCL work in this way
Resolves: COMPMID-4116, COMPMID-4110
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Change-Id: I560eda63fce24abd03d061f78f2f2ca951053fd0
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4898
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Pass the quantized zero value to the opencl kernel
Fixes COMPMID-3908
Change-Id: I6454c2e49f5b150a99178f2d72e0afa0a2990b54
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4884
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Refactor direct convolution for NHWC
- Remove old kernels for NHWC
- Change the heuristic in CLConvolutionLayer.cpp. The new direct
convolution implementation is faster than FFT
Resolves COMPMID-3908
Change-Id: Iee15ce7b04e21847b6eaae5c6d3c1b18180e7efc
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4876
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Explicitly cast scalar to vector for LOGICAL_NOT
Related with COMPUTE-12536 and IVGCVSW-5617
Change-Id: I03accce000f8889fc4fb88c42c3c87845acb4f42
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4874
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|