Age | Commit message (Collapse) | Author |
|
As the first phase of making NEGEMMConv2d stateless,
CpuGemmDirectConv2d operator is created. Kernels and
operators used by the operator use TensorInfo pointers
instead of Tensor pointers.
The CpuGemmDirectConv2d isn't completely stateless
because it manages one intermediate tensor internally.
This will be resolved by implementing memory injection
mechanism with the following patches.
Also, weight manager of CpuGemmAssemblyDispatch is disabled
to enable this work.
Implements: COMPMID-4506
Change-Id: Iec3ca6de29d98bef7ea95e8f4473d6dc0024a140
Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5672
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
When the used auxilary tensors are kept in the tensor pack,
the pointers in the tensor pack can point memory space
that has been already freed. In some cases, this leads
crashes or unpredictable behaviour.
Resolves: COMPMID-4536
Change-Id: I4cc83310dfe084d477a5f9bece216228c3edb172
Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5710
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Make GEMM use its native version if weights are dynamic. This ensures no reshape gets performed on the weights tensor
Enable dynamic weights tests for the OpenCL backend
Resolve COMPMID-4223
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Change-Id: Iccc4806701772cede23e24df09c786914d00034c
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5652
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
|
|
- Renames DepthConvert to Cast
- Ports both NEDepthConverLayer and CLDepthConvert variants
- Removes legacy shift capability from DepthConvert, allowing only
shifts of 0
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I806a0f8eb23d23502b632c529fda7edde19c8176
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5565
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Moves the following kernels:
- CLGEMMMatrixMultiplyKernel
- CLGEMMMatrixMultiplyNativeKernel
- CLGEMMMatrixMultipluReshapedKernel
- CLGEMMMatrixMultiplyReshapedOnlyRHSKernel
Moves the following functions
- CLGEMM
Introduces facilities to easy handling of auxiliary temporary buffers
under then new run interface. Such are:
- CLAuxTensorHandler: That allows wrapping of workspace buffers memory
to CLBuffer objects
- Ability to inject TensorInfo to allocator without transferring
ownership. This reduce the copy overhead if needed.
Resolves: COMPMID-4188
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I7055435d831b05b749b26302082e4ac45f26dfb0
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5498
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Use of out_of_tensor function to check if parallel instructons can be used safely
Reverting to serial computation otherwise
Resolves: COMPMID-4449
Change-Id: I23a986612e3c5d0367e23e56f1aeedbb1330cffc
Signed-off-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5651
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Resolves: COMPMID-4502
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I00621a772135f65a92aa70e6a19dd5f743915378
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5649
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-4503
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: Ic536f62a9561d709c16d5f9cca28784cb7f281b6
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5650
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: Ic2cd0e46757af2ad7d0b3d733857ba9fdf860864
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5653
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Idae4fc687942f61a1f63f23c9e5538df28888d93
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5632
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Fix corner case in which the quantization is per channel and OFM == 1
The function can safely set the per_channel quantization flag to false since there is only one output channel
This way, the kernel will avoid adding useless padding to the output multipliers and shifts
Resolve COMPMID-4384
Change-Id: Ic03452bfaf52d1be536cd371721adedd2e580a08
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5648
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
- Call GEMM-based convolution rather than Direct convolution for
quantized data types
Resolves COMPMID-4497
Change-Id: Idde5808b8a9de69c5196c56538520c0e3d52ba76
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5618
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
- Dispatch, WrapperKernel has been renamed and moved
- Header files for assembly kernels have been moved
Partially Resolves: COMPMID-4506
Change-Id: I6c2f391bb95ba1ce7ca195d0efa57b9c3225570f
Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5637
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Remove Macros.h from arm_compute and avoid use of headers from src
inside files in the public interface.
Resolves: COMPMID-4525
Change-Id: I58b1b46896d366078cc9df7a0e36d5878064051d
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5641
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
CLCoreRuntime context is currently unused and is planned to be replaced
by the Context infrastructure
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: Ic2874800960ca954f647e8867e7db951ce823e1c
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5571
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-4392
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: Id0a1694fd479499347d5f820fc228a69cb55f4d1
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5610
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Remove TODO/FIXME either already done or won't do.
Partially resolve: COMPMID-4471
Signed-off-by: Sheri Zhang <sheri.zhang@arm.com>
Change-Id: Iec8f25b9bf1b2a70c072dd17d44625fa93e84ed1
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5591
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
- Call direct convolution when filter size height is greater than or equal to 5
Resolves COMPMID-4439
Change-Id: Ie8ccccc0629eb4c74bd62c4bb4ced47f6898a945
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5589
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Renames the following kernels/functions
- [Cl|Cpu]DequantizationKernel -> [Cl|Cpu]DequantizeKernel
- [Cl|Cpu]Dequantization -> [Cl|Cpu]CpuDequantize
- [Cl|Cpu]QuantizationKernel -> [Cl|Cpu]QuantizeKernel
- [Cl|Cpu]Quantization -> [Cl|Cpu]Quantize
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: Ic3c5eb3b7fe28f807294d159830eef99c2dd6219
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5566
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Changes the names of the following:
- PixelWiseMultiplicationKernel to MulKernel for all backends
- PixelWiseMultiplication to Mul for all backends
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I88108c2d22c888fce37ea1028863026160b9da97
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5534
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I6c9555f945a4a2a986f1b8dbb2782f2e3994c502
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5537
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Change the parallel implementation across the X, now every thread computes one row
Add missing test for MEAN_SUM
Make reduction on any axis != 0 work with num_channels > 1
Resolve COMPMID-3917
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Change-Id: Ib0f99540104e3c253bcd1ea637833db533f5e76e
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5522
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I5c440f4c6ca4186adcfa926e6b7d924086671f29
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5520
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Manuel Bottini <manuel.bottini@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
This variant with GEMM solves regression problem only partially, but
guarantees precision.
Resolves: COMPMID-4387
Signed-off-by: Aleksandr Nikolaev <aleksandr.nikolaev@arm.com>
Change-Id: I7fdb7d6ecf9d67926d994a9f666a80aeac3147a5
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/317792
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
Comments-Addressed: bsgcomp <bsgcomp@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5470
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Add QSYMM8_PER_CHANNEL support on weight input for CLDeconvolutionLayer.
When weights are per-channel quantized type "Direct" method is always
used.
Also reduce number of QSYMM8_PER_CHANNEL tests for NEDeconvolutionLayer.
Resolves: COMPMID-3438
Signed-off-by: Freddie Liardet <frederick.liardet@arm.com>
Change-Id: I1330cac5142e19d21e322574fb8d912558745b02
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5484
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
* Resolves COMPMID-4400
Change-Id: I54c33a017c735194fbf4437d1c7df465208bc0ca
Signed-off-by: Pablo Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5505
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Sheri Zhang <sheri.zhang@arm.com>
|
|
Resolves: COMPMID-4395
Change-Id: Ib3dfdc42e95998c1e5713d6ec1bdaa83299b0360
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5488
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
This patch enables CLVK through the graph API and inside the
CLScheduler. By default the Native platform is selected.
Selecting CLVK can be done via --target=clvk.
Resolves COMPMID-4205 and COMPMID-4206
Change-Id: Ic60744980c6b8a60e776627ea677ed46be88f656
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5475
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Check if biases are null before passing them to
CpuDepthwiseNativeKernel.
Resolves: COMPMID-4389
Change-Id: I29acbdefb75b9e81c293801a265e9b850d8d00b9
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5464
Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Remove the reshaped variant for CLDepthwiseConvolutionLayer 3x3 NHWC Quantized
- Remove kernel selection by GPUTarget
- Remove unused quantized support from the NHWC kernel
- Remove CLDepthwiseConvolutionLayerReshapeWeightsKernel
- Remove OpenCL kernels for reshaped dwc 3x3 quantized and weights reshape
- Remove the "_bifrost" suffix in common OpenCL kernel
- Remove the ICLDepthwiseConvolutionLayer3x3Kernel common interface
Resolve COMPMID-3864, COMPMID-3907
Change-Id: Icfac0fb6c00e214985beb05dad7c0cdbbee7d830
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5447
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
* Remove includes of NEConvertFullyConnectedWeightsKernel.h
Resolves partially: COMPMID-4187
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>
Change-Id: I1bf246546d3ef53edb4c5a8bc05a0db92d2d3bff
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5418
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Change kernel's vec_size to 16 / sizeof(output)
- Change ICLKernel.cpp to handle broadcast without padding
Resolve COMPMID-3913
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Change-Id: I03e884b250ef5784dc109bff8cf2c96b345d119f
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5450
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
* This commit removes the tracing code which has not been maintained for a few releases.
* Resolves MLCE-445
Change-Id: I14793c82fe58ffef0cf936edf4af077b5dde85f8
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5455
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
* Make changes to split the workload into two kernels. One kernel precomputes
mean and variance and the second kernel just loads these precomputed values.
* The new approach runs %30 faster than the original code for NHWC workloads
like 32x192x256.
* Resolves MLCE-337
Change-Id: I8356fcefa2d131ab4dcb32268ce7142421d073e4
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5355
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
|
|
Resolves: COMPMID-4185
Change-Id: Ib5f22356356a022d567bb18d44ea272b62d10ebf
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5424
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Replace std::map with a basic container with std::array
Change-Id: I76f53ca61676ca0e5136ce61a3f3adb10e22b4c3
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5441
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Give the ability to the user to specify an allocator that can be used by
all the internal function tensors. This being a global needs to outlive
all the tensors/functions that are using it.
Resolves: COMPMID-4212, COMPMID-4213
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I251871c242879976819ebca1452404133a8e62d7
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5420
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Partially resolves: COMPMID-4009
Change-Id: I19ffb61c5c4541134a5028677d2d81228740e454
Signed-off-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5419
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
|
|
Only for NHWC data layout
Resolves: COMPMID-3910
Change-Id: Ie2d71482b3e3b55ac155e9af152032a5de8bbd50
Signed-off-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5388
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
* Replace ICLKernel by IClKernel in other unrelated kernels
Resolves partially: COMPMID-4187
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>
Change-Id: I173b8f2ac645dbfd7d412f4b058c5c9655c229ee
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5402
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolve COMPMID-4367
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Change-Id: I5e65b62c2ca52cf65950c9c343864ef55b7122c3
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5407
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
- The cl_image object can be used for the weights
- cl_image can only work for f32/f16
- Fix the implicit padding on the first dimension X
Resolves COMPMID-4341
Change-Id: I04e0901c69e7765c42afceca38c4a840645b9123
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5393
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves partially: COMPMID-4359 (1/2)
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>
Change-Id: Id1859f3cd530eb05f027226e2004cf518778147e
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5377
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves partially: COMPMID-4359 (2/2)
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>
Change-Id: Id65ef04268575cc9d74be6114e82e116b8ed106d
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5378
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
This new scheduler mode is implemented to reduce runtime overhead on
high thread counts by distributing the scheduling work to all threads.
The fanout mode should only be enabled on high thread counts
(e.g. > 8 threads).
Alternatively the mode can be forced by setting the environment variable
ARM_COMPUTE_CPP_SCHEDULER_MODE to be either "linear" (default) or
"fanout". Note that on bare-metal this functionality is turned off but
it does not matter as only multi-threading is not supported on
bare-metal.
Resolves COMPMID-4349
Signed-off-by: SiCongLi <sicong.li@arm.com>
Change-Id: I46e2fab83ea24e616c82ae94dca7b2e72a73c7b8
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5352
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Add QSYMM8_PER_CHANNEL support on weight input for NEDeconvolutionLayer and reference version.
Resolves: COMPMID-3437
Signed-off-by: Freddie Liardet <frederick.liardet@arm.com>
Change-Id: I7c9a28d4d0fea324ed8e5a24fbd0422e5ede145c
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5364
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change data layouts after the configure in validation tests for:
- Scale
- Pooling
- FullyConnected
- DepthwiseConvolution
- DirectConvolution
- FFTConvolution
- WinogradConvolution
- GEMMConvolution (Indirect GEMM included)
Extending fixtures
Fixes for new mixed data layout tests
Resolves: COMPMID-4162
Change-Id: I2f2eb2075f7e24ab3872249d88cadb57b82c5dde
Signed-off-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5326
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Also fixes RoiPoolingLayer not matching reference with Float32 datatype Issue
Tests added to check ROIPooling Layer against reference with both Float32 and Qasymm8 input.
Resolves : COMPMID-2320
Change-Id: Ib86d2e6b3803e74f922a545ea573da02c28e54cc
Signed-off-by: Suhail Munshi <MohammedSuhail.Munshi@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5332
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
|
|
Resolves: COMPMID-4299
Change-Id: Ie6a52c1371b9a2a7b5bb4f019ecd5e70a2008567
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5338
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
COMPMID-1088 has been closed and the FIXME can be removed.
Change-Id: I2ee103ab12e65383a62bfe3fc4aa0ed90c211510
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5341
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|