aboutsummaryrefslogtreecommitdiff
path: root/arm_compute/core
AgeCommit message (Collapse)Author
2021-10-12Remove padding in cpuPool2d NCHWFreddie Liardet
Remove padding from all cpuPool2d NCHW kernels (FP16,FP32 & Quantized) Resolves: COMPMID-4728, COMPMID-4823 Signed-off-by: Freddie Liardet <frederick.liardet@arm.com> Change-Id: Ida619f67cd6606b33828f2d9dee925aeb794cc50 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6358 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-10-07Add support for 5D data layout indexingGiorgio Arena
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Change-Id: Ib346bb6b90d2220ec5934c83a9a1f0cd540b8731 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6377 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-09-29Add support for non-constant weights and biases in CpuFullyConnectedGiorgio Arena
Changing the approach for specifying that weights and biases tensors are non-constant by making it a member of TensorInfo rather than an option of the functions. Resolves: COMPMID-4222, COMPMID-4811 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Change-Id: I9b0081ccbcf8271ce029ba6755563d64c59e1d32 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6313 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-09-23Use std::vector instead of Coordinates for ITensorInfo's _dims_stateGiorgio Arena
Resolve COMPMID-4798 Change-Id: I2661de88640dcfee92d207f7e884c066eb19ab94 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6306 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com>
2021-09-22Update OpenCL header file to version 2020.12.18Sheri Zhang
Resolves: COMPMID-4656 Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Change-Id: I7735b9828736baa7cdc4690e191a489c824530c6 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6280 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-09-22Provide logging for configure functions in all NEON functionsramelg01
Partially Resolves: COMPMID-4718 Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com> Change-Id: I655268c57fa126d9c99981c49d345a3aac75646e Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6286 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com>
2021-09-16Revert "Add support for non-constant weights and biases in CpuFullyConnected"Pablo Marquez Tello
This reverts commit aed63ee175e0d64c934389e9d1b2edd0cb1a5cdd. * Resolves COMPMID-4812 Change-Id: I16919e2f3b22c868ae146d0d10dae97a80e1ba46 Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6266 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sheri Zhang <sheri.zhang@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-09-15Adds Conv3d reference implementation support.Adnan AlSinan
Expands the interface with the following items: - Size3D Class. - Conv3dInfo Struct. - Padding3D Struct. - Add 'NDHWC' to supported Tensor Data Layouts. - Add function to compute expected size of Conv3d. Resolves COMPMID-4658 & COMPMID-4657 Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com> Change-Id: Ic7452c48461eedaa38eaf3ac458f54b031e7dfa8 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6187 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-09-07Add support for non-constant weights and biases in CpuFullyConnectedMichele Di Giorgio
Changing the approach for specifying that weights and biases tensors are non-constant by making it a member of TensorInfo rather than an option of the functions. Resolves: COMPMID-4222 Change-Id: I96e6f3868f51785c9700a3ef6a1fe7b05747862c Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6162 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2021-09-01Printing operators parameters, currently for CpuAdd operator only.Ramy Elgammal
Resolves: COMPMID-4718 Change-Id: Id4dd762cd1b759bb814b9d0b1ea0c9ba4dfbae6f Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6139 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-09-01Fixed compiler errorPablo Marquez Tello
* Missing noexcept causes compilation to fail on GCC 9.3.0 * Resolves MLCE-595 Change-Id: I960608dbeaacac3699465da4b75740237d65559c Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6182 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-08-24Re-use auxiliary memory withing CpuWinogradConv2d operatorsGeorgios Pinitas
Input/Output transformation operations are independent and done in different time-steps of the algorithm, this memory can be re-used between this transformation stages. Moreover, reduce the allocation when extracting workspace sizes for Winograd trasformations. There is a mix return of sizes in bytes and elements, thus ensure the correct is in place. storage_size() member functions return elements while working_space() function bytes. Resolves: COMPMID-4781 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I705445ba7ca818cead48369db3cacd49684c7192 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6145 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2021-08-13Avoid releasing weights if they are used by multiple functionsGeorgios Pinitas
Resolves: COMPMID-4769 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: Iccadcbd68b0fd84ed3bf212e358a4ea944084a40 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/349845 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: bsgcomp <bsgcomp@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6107 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-08-09Remove TODO, FIXME and COMPMID comments where possibleFreddie Liardet
Signed-off-by: Freddie Liardet <frederick.liardet@arm.com> Change-Id: I968787603927bcfbeacb110570eb488061ee3e43 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6058 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-07-22Update GEMM assembly kernelsGeorgios Pinitas
- Introduce Fp32 kernels with internal calculations in Bfloat16 when fast_mode is enabled - Improve kernel selection heuristics Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I68a9e7e862b6fd2721b46e0d7cc791091c4ab279 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5965 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-07-16Avoid multiple Rhs matrix transformation on ClGemmGeorgios Pinitas
ClWinogradConv2d was performing Rhs transformation on every step impacting the performance. Adds scope logging support through ARM_COMPUTE_LOG_MSG_WITH_FUNCNAME Resolves: COMPMID-4596 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: Ib329d3bc8d8aa21abae9fabfe61de35cc84d4819 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5925 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-07-06Fix manual LOOP_UNROLLINGGian Marco Iodice
The issue is caused by the number of iterations passed to LOOP_UNROLLING. When we use the manual LOOP_UNROLLING, the number of iterations must be less than or equal to 128. To overcome this problem, we create a utility function to check if any of the critical iterations (kernel dimensions) are beyond that limit. If so, the utility function, disable the manual loop unrolling. Resolves COMPMID-4609 Change-Id: I7221c967609e462a5abd1cbb74e2a120f344fcb3 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5913 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-07-02Rework OpenCL Depthwise ConvolutionGian Marco Iodice
- Remove dedicated kernels for NCHW. Now we only use NHWC with permute - Remove specialized kernels for 3x3 NHWC - Simplify CLDepthwiseConvolutionLayer.cpp to call just the native implementation for both floating-point and quantized data types - Develop two parametric opencl kernels for depthwise convolution layer NHWC (floating-point and quantized) - Add support to export the weights to cl_image - Extend test for depthwise convolution on opencl Resolves COMPMID-4417 Change-Id: Ibe533f79c2860f9cac8e921895d5a8f947753a5c Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5893 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-06-30Revert "Rework OpenCL Depthwise Convolution"Gian Marco Iodice
This reverts commit 561c176598cd14245e2e7918fdf136d1c888d1da. Reason for revert: <validation> Change-Id: I6f2d61c27520439bb538e9265736532104b24cf8 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5127 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-06-29Port the ClGemmLowp kernels to the new APIGeorgios Pinitas
Ported kernels: - CLGEMMLowpMatrixMultiplyNativeKernel - CLGEMMLowpMatrixMultiplyReshapedKernel - CLGEMMLowpMatrixMultiplyReshapedOnlyRHSKernel - CLGEMMLowpOffsetContributionKernel - CLGEMMLowpOffsetContributionOutputStageKernel - CLGEMMLowpQuantizeDownInt32ScaleByFixedPointKernel - CLGEMMLowpQuantizeDownInt32ScaleByFloatKernel - CLGEMMLowpQuantizeDownInt32ScaleKernel Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I9d5a744d6a2dd2f2726fdfb291bad000b6970de2 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5870 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-06-29Port NEGEMM to memory injecting interface (Part 1)Michele Di Giorgio
- Start porting NEGEMM to the new API - Port NEGEMMInterleave4x4Kernel to the new API - Port NEGEMMMatrixAdditionKernel to the new API - Port NEGEMMTranspose1xWKernel to the new API - Remove padding from NEGEMMMatrixAdditionKernel - Remove unused INESimpleKernel and ICPPSimpleKernel Partially resolves: COMPMID-4402 Change-Id: I63edadddfe00a54586e5384d6a0211db25ae9042 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5857 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-06-29Set up the framework to choose the default LWSGiorgio Arena
Resolve COMPMID-4486 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Change-Id: Ib38b7943bd776a6d75d1da163908724c49eae73d Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5864 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-06-28Add code to detect Mali(TM) G31Pablo Marquez Tello
* Fixed a problem which made the library choose the Valhall path instead of Bifrost * Set block sizes to 4 for u8 on G31 * Resolves MLCE-463 Change-Id: I577f57ec96f740db15d6640e34172318a82c5778 Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5858 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Georgios Pinitas <georgios.pinitas@arm.com>
2021-06-24Rework OpenCL Depthwise ConvolutionGian Marco Iodice
- Remove dedicated kernels for NCHW. Now we only use NHWC with permute - Remove specialized kernels for 3x3 NHWC - Simplify CLDepthwiseConvolutionLayer.cpp to call just the native implementation for both floating-point and quantized data types - Develop two parametric opencl kernels for depthwise convolution layer NHWC (floating-point and quantized) - Add support to export the weights to cl_image - Extend test for depthwise convolution on opencl Resolves COMPMID-4417 Change-Id: I253dd5d959a70783c82e62b1771a5e9f91621cb0 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5806 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
2021-06-23Create core library using high priority operatorsMichalis Spyrou
A smaller core library is created using a subset of the operators. Changed the structure of filelist.json in order to include more information about the kernels and make the selection easier. Resolves: COMPMID-4514 Change-Id: I079ca7d8e64346174eebdd13b834e1dd4dc36ca2 Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5786 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-06-18Integrate improved CPU depthwise convolution kernelsMichele Di Giorgio
* Replace assembly kernels for depthwise convolution with more optimized ones. * Add int8 assembly kernels. * Fix implicit padding on optimized kernels Resolves: COMPMID-3867, COMPMID-4361 Change-Id: I0b0867e05f61be4f368f62190d55e14d0ab3ebf2 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5622 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2021-06-15Add CPU discovery capabilities.Georgios Pinitas
Resolves: COMPMID-4500 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I008c51934ef813fb1f489b531288c4419e701955 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5799 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-06-15Add NHWC support to CLRemapFrederick Liardet
Add NHWC support to CLRemap, also add relevant tests. Partially resolves COMPMID-4335. Change-Id: I119bea99be497fb85d5cd83a10f8d4e8e1f97f17 Signed-off-by: Freddie Liardet <frederick.liardet@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5773 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-06-01Optimize int8 arithmetic addition on CPUGiorgio Arena
Avoid accessing quantization info from TensorInfo in leftover loop. Use the already available UniformQuantizationInfo instead Create another version of the quantize utility function which assumes RoundingPolicy::TO_NEAREST_UP. This allows us to call std::lround() and avoid some overhead Resolve COMPMID-4546 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Change-Id: Ib481a586f879b7e937e3d54ba11100d0a37ef277 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5722 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-05-25Remove used auxilary tensors from ClGemm's tensor packSang-Hoon Park
When the used auxilary tensors are kept in the tensor pack, the pointers in the tensor pack can point memory space that has been already freed. In some cases, this leads crashes or unpredictable behaviour. Resolves: COMPMID-4536 Change-Id: I4cc83310dfe084d477a5f9bece216228c3edb172 Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5710 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-05-20Add support for dynamic weights in CL FullyConnected layerGiorgio Arena
Make GEMM use its native version if weights are dynamic. This ensures no reshape gets performed on the weights tensor Enable dynamic weights tests for the OpenCL backend Resolve COMPMID-4223 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Change-Id: Iccc4806701772cede23e24df09c786914d00034c Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5652 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2021-05-18Port CLGEMM to memory injecting interfaceGeorgios Pinitas
Moves the following kernels: - CLGEMMMatrixMultiplyKernel - CLGEMMMatrixMultiplyNativeKernel - CLGEMMMatrixMultipluReshapedKernel - CLGEMMMatrixMultiplyReshapedOnlyRHSKernel Moves the following functions - CLGEMM Introduces facilities to easy handling of auxiliary temporary buffers under then new run interface. Such are: - CLAuxTensorHandler: That allows wrapping of workspace buffers memory to CLBuffer objects - Ability to inject TensorInfo to allocator without transferring ownership. This reduce the copy overhead if needed. Resolves: COMPMID-4188 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I7055435d831b05b749b26302082e4ac45f26dfb0 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5498 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-05-13Fix Pooling Layer Bug when input is 1xN sizeFreddie Liardet
Return error in pooling layer when any calculated output dimension is less than 1. Simplify use of pooling layer output dimension values in CpuPoolingKernel.cpp. Remove some invalid tests in cpu/gpu pooling layers. Resolves COMPMID-4358. Signed-off-by: Freddie Liardet <frederick.liardet@arm.com> Change-Id: If8f8ffec579d3eca1c27a45e5b0b684a77103cff Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5559 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-05-12Remove unused CLCoreRuntimeContextGeorgios Pinitas
CLCoreRuntime context is currently unused and is planned to be replaced by the Context infrastructure Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: Ic2874800960ca954f647e8867e7db951ce823e1c Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5571 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-05-11Add constant_weights boolean to FullyConntectedLayerInfoMichele Di Giorgio
This is needed to enable support for non-constant weights in Fully Connected Layer, useful in NLP use cases where Fully Connected layers weights come from other nodes in the graph and therefore change between runs. Resolves: COMPMID-4220 Change-Id: I0c2fe97eeb7554efac7ea9a6d6b7243332615052 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5579 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2021-05-05Creates ClKerneLibrary to serve just as a kernel container without build logicGeorgios Pinitas
Moves all the kernels out of the legacy CLKernelLibrary into a separate class that its planned responsibility is to only handle the kernel strings. Currently CLKernelLibrary mixes kernel inventory and building logic, thus coupling multiple responsibilities. By separating compilation logic, the kernel library removes any dependencies from the OpenCL and simplifies maintainance but also increases flexibility. Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: Idcd491783b5a413bd944d385956075ead0204362 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5572 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Aleksandr Nikolaev <aleksandr.nikolaev@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-04-28Add Queue supportGeorgios Pinitas
Queues are responsible for scheduling operators and performing other runtime related activities like for example tuning. Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I0366d9048470d277b8cbf59fa42f95c0ae57c5c9 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5487 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-04-21Add support for CLVKMichalis Spyrou
This patch enables CLVK through the graph API and inside the CLScheduler. By default the Native platform is selected. Selecting CLVK can be done via --target=clvk. Resolves COMPMID-4205 and COMPMID-4206 Change-Id: Ic60744980c6b8a60e776627ea677ed46be88f656 Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5475 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2021-04-20Update assembly codeMichalis Spyrou
This patch brings performance uplift on Cortex-A35. Resolves: COMPMID-4316 Change-Id: I2b9c02a599373f780dd1b981b821e33bd59a3422 Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5461 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-04-20CLDepthwiseConvolutionLayer rework - Part 1Giorgio Arena
Remove the reshaped variant for CLDepthwiseConvolutionLayer 3x3 NHWC Quantized - Remove kernel selection by GPUTarget - Remove unused quantized support from the NHWC kernel - Remove CLDepthwiseConvolutionLayerReshapeWeightsKernel - Remove OpenCL kernels for reshaped dwc 3x3 quantized and weights reshape - Remove the "_bifrost" suffix in common OpenCL kernel - Remove the ICLDepthwiseConvolutionLayer3x3Kernel common interface Resolve COMPMID-3864, COMPMID-3907 Change-Id: Icfac0fb6c00e214985beb05dad7c0cdbbee7d830 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5447 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-04-20Remove experimental tracing featurePablo Marquez Tello
* This commit removes the tracing code which has not been maintained for a few releases. * Resolves MLCE-445 Change-Id: I14793c82fe58ffef0cf936edf4af077b5dde85f8 Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5455 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-04-19Port DepthwiseConvolution to new APIMichalis Spyrou
Resolves: COMPMID-4185 Change-Id: Ib5f22356356a022d567bb18d44ea272b62d10ebf Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5424 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-04-15Fix validation bug in release mode for armv7Giorgio Arena
Resolve COMPMID-4377, COMPMID-4379 Change-Id: I302f08b5bf0afb5295d31843fea20181d9283658 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5435 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-04-07Implement Fanout mode in CPPSchedulerSiCongLi
This new scheduler mode is implemented to reduce runtime overhead on high thread counts by distributing the scheduling work to all threads. The fanout mode should only be enabled on high thread counts (e.g. > 8 threads). Alternatively the mode can be forced by setting the environment variable ARM_COMPUTE_CPP_SCHEDULER_MODE to be either "linear" (default) or "fanout". Note that on bare-metal this functionality is turned off but it does not matter as only multi-threading is not supported on bare-metal. Resolves COMPMID-4349 Signed-off-by: SiCongLi <sicong.li@arm.com> Change-Id: I46e2fab83ea24e616c82ae94dca7b2e72a73c7b8 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5352 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-04-06Add tensor related data structures for the new APIGeorgios Pinitas
Adds the following: - TensorDescriptor: which is responsible for holding the information needed to represent a tensor (e.g. shape, dimensions, etc) - Tensor: an aggreate object of a descriptor and a backing memory - TensorPack: A map of tensor that can be passed to operators as inputs/outputs Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I02734ac6ad85700d91d6e73217b4637adbf5d177 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5260 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-03-31Fix trademarks throughout the codebaseMichele Di Giorgio
Resolves: COMPMID-4299 Change-Id: Ie6a52c1371b9a2a7b5bb4f019ecd5e70a2008567 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5338 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-03-31Remove Computer Vision generic interfaces and typesGeorgios Pinitas
Removes: - reference validation routines - CV related types and structures - CV related interfaces Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I3a203da12d9b04c154059b190aeba18a611149a9 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5340 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-03-26Remove GLES-related codeMichele Di Giorgio
Change-Id: I208281d6e9ec15f9dba03cfbdc36ba2bf072d592 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5314 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2021-03-25Avoid -O3 optimizations on TensorShape's copy for linux armv7a in release modeGiorgio Arena
Resolve COMPMID-4286 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Change-Id: I2f658f1b366c6bccada9b81de1f310602c41a161 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5176 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-03-23Fixed compiler errorsPablo Marquez Tello
* Some compilers fail to build due to the inconsistent use of the noexcept clause Change-Id: I1f44abec84d8d0c8dd45662d1e309d006dcf9b64 Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5281 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>