Age | Commit message (Collapse) | Author |
|
The following operators are now stateless by implementing
memory injection.
- CpuDirectGemmConv2d
- CpuGemmAssemblyDispatch
A test case is added to test if CpuDirectGemmConv2d can
run on different group of tensors with a single configure.
Resolves: COMPMID-4506
Change-Id: I48f44ed41236ca7e18da2de07bdbacc9007a3c5e
Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5718
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com>
|
|
Node fusion mutator adds new nodes to the node list, invalidating the
list iterator during iteration. The fix uses index instead.
Resolves COMPMID-4543
Change-Id: Ibb38e2394c616ac6e458568e73f09fb5243e94fe
Signed-off-by: SiCongLi <sicong.li@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5717
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Add implicit padding test on weights before the configure
Fix problem of considering left padding of weights when using cl image
Resolves: COMPMID-4493
Change-Id: I141d2de68e8bdfcbd6f18209db4f29fcc05305a1
Signed-off-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5689
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
As the first phase of making NEGEMMConv2d stateless,
CpuGemmDirectConv2d operator is created. Kernels and
operators used by the operator use TensorInfo pointers
instead of Tensor pointers.
The CpuGemmDirectConv2d isn't completely stateless
because it manages one intermediate tensor internally.
This will be resolved by implementing memory injection
mechanism with the following patches.
Also, weight manager of CpuGemmAssemblyDispatch is disabled
to enable this work.
Implements: COMPMID-4506
Change-Id: Iec3ca6de29d98bef7ea95e8f4473d6dc0024a140
Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5672
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
When the used auxilary tensors are kept in the tensor pack,
the pointers in the tensor pack can point memory space
that has been already freed. In some cases, this leads
crashes or unpredictable behaviour.
Resolves: COMPMID-4536
Change-Id: I4cc83310dfe084d477a5f9bece216228c3edb172
Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5710
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Change-Id: I12608eb4d681ea07d9d508226600a11365238a00
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5682
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Manuel Bottini <manuel.bottini@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Make GEMM use its native version if weights are dynamic. This ensures no reshape gets performed on the weights tensor
Enable dynamic weights tests for the OpenCL backend
Resolve COMPMID-4223
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Change-Id: Iccc4806701772cede23e24df09c786914d00034c
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5652
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
|
|
Change-Id: Id98a107d512369d3799961011a84e9cc4d99e775
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5679
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Create right vector for size of 16 elements
Resolves: COMPMID-4532
Change-Id: I74dc89d158a6aa3f50db710cbc002a92156e2263
Signed-off-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5673
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Sheri Zhang <sheri.zhang@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Renames DepthConvert to Cast
- Ports both NEDepthConverLayer and CLDepthConvert variants
- Removes legacy shift capability from DepthConvert, allowing only
shifts of 0
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I806a0f8eb23d23502b632c529fda7edde19c8176
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5565
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Moves the following kernels:
- CLGEMMMatrixMultiplyKernel
- CLGEMMMatrixMultiplyNativeKernel
- CLGEMMMatrixMultipluReshapedKernel
- CLGEMMMatrixMultiplyReshapedOnlyRHSKernel
Moves the following functions
- CLGEMM
Introduces facilities to easy handling of auxiliary temporary buffers
under then new run interface. Such are:
- CLAuxTensorHandler: That allows wrapping of workspace buffers memory
to CLBuffer objects
- Ability to inject TensorInfo to allocator without transferring
ownership. This reduce the copy overhead if needed.
Resolves: COMPMID-4188
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I7055435d831b05b749b26302082e4ac45f26dfb0
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5498
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Use of out_of_tensor function to check if parallel instructons can be used safely
Reverting to serial computation otherwise
Resolves: COMPMID-4449
Change-Id: I23a986612e3c5d0367e23e56f1aeedbb1330cffc
Signed-off-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5651
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Resolves: COMPMID-4502
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I00621a772135f65a92aa70e6a19dd5f743915378
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5649
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-4503
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: Ic536f62a9561d709c16d5f9cca28784cb7f281b6
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5650
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- k0 should be set to 16 for quantized data types
Resolves COMPMID-4497
Change-Id: I729a8e2b7cd45762df4fef8b4c8606fe6367adb5
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5654
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: Ic2cd0e46757af2ad7d0b3d733857ba9fdf860864
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5653
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I092d10534816f5b3717325952033c351b8231380
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5643
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Idae4fc687942f61a1f63f23c9e5538df28888d93
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5632
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Fix corner case in which the quantization is per channel and OFM == 1
The function can safely set the per_channel quantization flag to false since there is only one output channel
This way, the kernel will avoid adding useless padding to the output multipliers and shifts
Resolve COMPMID-4384
Change-Id: Ic03452bfaf52d1be536cd371721adedd2e580a08
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5648
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
- Bring the epsilon up to 1e-3 for FP16 (both backends) since it was causing the reference's variance being negative and its square root being NaN
- Bring the epsilon up to 1e-7 for FP16 NEON test for the same problem on the NEON kernel
- Adjust the CL kernel's vec_size when input tensor's width < 16 and use macros agnostic of vector size for sum reduction
- Add previously mismatching tensor shapes
Resolve COMPMID-4354
Change-Id: I823c871aacb72326f90c86b24cb16c3e2d4bd15e
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5630
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
- Call GEMM-based convolution rather than Direct convolution for
quantized data types
Resolves COMPMID-4497
Change-Id: Idde5808b8a9de69c5196c56538520c0e3d52ba76
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5618
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
- Dispatch, WrapperKernel has been renamed and moved
- Header files for assembly kernels have been moved
Partially Resolves: COMPMID-4506
Change-Id: I6c2f391bb95ba1ce7ca195d0efa57b9c3225570f
Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5637
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Remove Macros.h from arm_compute and avoid use of headers from src
inside files in the public interface.
Resolves: COMPMID-4525
Change-Id: I58b1b46896d366078cc9df7a0e36d5878064051d
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5641
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-4527
Change-Id: If038d2477d8898d3ee307fe58301fb0b16b64c02
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5640
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Only for NHWC
Instead of starting from the input vector and insert the values in the output,
we now create an output vector by taking from the input tensor. This makes it
easier to store the result in the output with border awareness
Resolves: COMPMID-4445
Change-Id: I77cf2d2d5ff30a383cddb8a3c81c462d7a39fd2e
Signed-off-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5627
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
|
|
Return error in pooling layer when any calculated output dimension is less than 1.
Simplify use of pooling layer output dimension values in
CpuPoolingKernel.cpp.
Remove some invalid tests in cpu/gpu pooling layers.
Resolves COMPMID-4358.
Signed-off-by: Freddie Liardet <frederick.liardet@arm.com>
Change-Id: If8f8ffec579d3eca1c27a45e5b0b684a77103cff
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5559
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Change vectors' fixed length of 4 in gemmlowp_output_stage_quantize_down_float with its kernel's adjusted vec_size
Resolve COMPMID-4355
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Change-Id: I371ecf114356fb6a8d18c5c3727f09ae247484bd
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5631
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Fix issue where gpu select kernel would not compile when input was 1xN
size.
Also add relevant tests for 1xN input.
Resolves: COMPMID-4357
Signed-off-by: Freddie Liardet <frederick.liardet@arm.com>
Change-Id: Ib5f339379e9cc7ef05605312889dbad34cbfe5dd
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5620
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolve COMPMID-4339
Change-Id: Ieacd288ca8f69e04d88692a093325a24198ebe3e
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5623
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
When neither OpenCL nor accelerated CPU backends are built then unused
variable warnings are signalled.
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I548a440849178c8430f74ab4f75736d081a50903
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5604
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
CLCoreRuntime context is currently unused and is planned to be replaced
by the Context infrastructure
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: Ic2874800960ca954f647e8867e7db951ce823e1c
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5571
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves : <COMPMID-3793>
Signed-off-by: Suhail Munshi <MohammedSuhail.Munshi@arm.com>
Change-Id: I4a6144ba788ae46d9637987455bec2dff8b6f561
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5586
Reviewed-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-by: Aleksandr Nikolaev <aleksandr.nikolaev@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-4392
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: Id0a1694fd479499347d5f820fc228a69cb55f4d1
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5610
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- DOT_PRODUCT8_INTEGER8 and DOT_PRODUCT16_INTEGER8 are calling
DOT_PRODUCT4_INTEGER8 without passing DST_DATA_TYPE
Resolves COMPMID-4491
Change-Id: I394bd2f9208489e820885e49ed40e607d6470620
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5594
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Remove TODO/FIXME either already done or won't do.
Partially resolve: COMPMID-4471
Signed-off-by: Sheri Zhang <sheri.zhang@arm.com>
Change-Id: Iec8f25b9bf1b2a70c072dd17d44625fa93e84ed1
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5591
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
- Call direct convolution when filter size height is greater than or equal to 5
Resolves COMPMID-4439
Change-Id: Ie8ccccc0629eb4c74bd62c4bb4ced47f6898a945
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5589
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
* Revert "Workaround for compiler error in gcc-9.2 and 9.3"
This reverts commit e81825bf68ebfce21f6839fa59ddb7e22884a206.
* Resolves COMPMID-4462
* Solves issue when using GCC9.2
Change-Id: Id17a5eadb10083ea10665f1f42f21a3af3cafd36
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5587
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Moves all the kernels out of the legacy CLKernelLibrary into a separate
class that its planned responsibility is to only handle the kernel
strings.
Currently CLKernelLibrary mixes kernel inventory and building logic,
thus coupling multiple responsibilities.
By separating compilation logic, the kernel library removes any dependencies
from the OpenCL and simplifies maintainance but also increases flexibility.
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: Idcd491783b5a413bd944d385956075ead0204362
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5572
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Aleksandr Nikolaev <aleksandr.nikolaev@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Renames the following kernels/functions
- [Cl|Cpu]DequantizationKernel -> [Cl|Cpu]DequantizeKernel
- [Cl|Cpu]Dequantization -> [Cl|Cpu]CpuDequantize
- [Cl|Cpu]QuantizationKernel -> [Cl|Cpu]QuantizeKernel
- [Cl|Cpu]Quantization -> [Cl|Cpu]Quantize
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: Ic3c5eb3b7fe28f807294d159830eef99c2dd6219
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5566
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Partially resolves : COMPMID-3793
Signed-off-by: Suhail Munshi <MohammedSuhail.Munshi@arm.com>
Change-Id: Id82e00c784f0a039017fd896f11671bdda2dd4ab
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5530
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
x - x^3/3 is more accurate approximation for |x| < 0.005
than (exp2x - 1)/(exp2x + 1).
Resolves: COMPMID-4098
Signed-off-by: Aleksandr Nikolaev <aleksandr.nikolaev@arm.com>
Change-Id: If6f9d7ce4d8d00d36d2dada7ab8f8d9f5b58f5c0
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/321354
Tested-by: bsgcomp <bsgcomp@arm.com>
Comments-Addressed: bsgcomp <bsgcomp@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5563
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Execution window along the X axis needs to be collapsed on the 3rd axis (rather than the 2nd) since there could be implicit padding added along the Y
Resolve COMPMID-4425
Change-Id: I9623a31749b737fea7c623cabdcfbf77cbe8f6dc
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5551
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
vpadd is not correctly converted by some compilers in debug. Therefore we opted
for a serial computation of the elements in the result vector for debug builds
Resolves: COMPMID-4420
Change-Id: I2d32af8568852a419226a409e3849d08e4e649c7
Signed-off-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5536
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Changes the names of the following:
- PixelWiseMultiplicationKernel to MulKernel for all backends
- PixelWiseMultiplication to Mul for all backends
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I88108c2d22c888fce37ea1028863026160b9da97
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5534
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I6c9555f945a4a2a986f1b8dbb2782f2e3994c502
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5537
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
All data type and data layout information for the operators are store in the function header files
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>
Change-Id: I30b564f7eda6bbd99bf3ad36ddb6639ac118eb8b
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/319829
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
Comments-Addressed: bsgcomp <bsgcomp@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5531
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Simplify the implementation when the pooling size has the same spatial
dimensions of the input tensor
- Rework the heuristic for F32/F16
- Add test for validating the global pooling path
- Fix compare_dimensions in validation. The validation fails because
we have different
number of dimensions for NCHW and NHWC (e.g. 1,1,2,1(NCHW) ->
2,1,1,1(NHWC)
Resolves COMPMID-4426
Change-Id: Ia53ee659a9fbc3d011f286a8150d1be9d6d2cd05
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5533
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
It doesn't take into consideration the padding.
Change-Id: Ie7a2a0c7aee93fd7007deb6e46f6bec0e7f06066
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5529
Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Change the parallel implementation across the X, now every thread computes one row
Add missing test for MEAN_SUM
Make reduction on any axis != 0 work with num_channels > 1
Resolve COMPMID-3917
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Change-Id: Ib0f99540104e3c253bcd1ea637833db533f5e76e
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5522
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
- Add missing config_id in ClPixelWiseMultiplicationKernel
Resolves COMPMID-4410
Change-Id: I26103d3395ac2e03591b97f8de1ffcb91cd3bf21
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5528
Reviewed-by: Manuel Bottini <manuel.bottini@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|