Age | Commit message (Collapse) | Author |
|
ClWinogradConv2d was performing Rhs transformation on every step
impacting the performance.
Adds scope logging support through ARM_COMPUTE_LOG_MSG_WITH_FUNCNAME
Resolves: COMPMID-4596
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: Ib329d3bc8d8aa21abae9fabfe61de35cc84d4819
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5925
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
* Resolved COMPMID-4645
Change-Id: I69a3844563257dbaefd20720fd6c12654bf73cf4
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5960
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Nikhil Raj Arm <nikhil.raj@arm.com>
|
|
Resolves: COMPMID-4516
Change-Id: I6a6db66797fa801dfe1238fceca413277241d2ec
Signed-off-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5946
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
* We prefer GEMM method for kernel size > 15 when data layout is NCHW because
direct convolution does not support this.
* Resolves COMPMID-4581
Change-Id: Ie18cc96bc6b446fd59e8c8ebb10c3af5ca02c3bb
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5935
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
|
|
Details:
port NEWeightsReshapeKernel to CpuWeightsReshapeKernel
port NEGEMMConvolutionLayer to CpuGEMMConvolutionLayer
Resolves: COMPMID-4509
Change-Id: I3c7051e2c3f6d808a7ccb898aad70e5b221b9dc3
Signed-off-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5938
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
|
|
Resolves: COMPMID-4653
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I4f69d42369bf8ab91cd027acf1c97e92ec1ef554
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5948
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-4517
Change-Id: I50cb02116a1ab86fc29200371944c4774e830746
Signed-off-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5949
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
A reduction step is performed with 2x2 reduction vectors for fp32 that
can lead to different results. Increasing the tolerance to accommodate.
Resolves: COMPMID-4644
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I53380bbd64e18efdaace9d0e96bb35cbfcea8f04
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5947
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
A typo was causing errors when building some CL kernels.
Resolves: COMPMID-4637
Change-Id: I7d1821e1a046ef8ccd306f72afe192732dd2ad1e
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5944
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
- Add in-place calculation support in ClArithmeticKernel, ClSaturatedArithmeticKernel and ClMulKernel
- Add in-place test cases
Resolves: COMPMID-4431
Signed-off-by: Sheri Zhang <sheri.zhang@arm.com>
Change-Id: Id484bdb76b74478a33fedb471ae0c7f799c599f6
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5885
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
* This will override the values in the Android.bp file and allow
building for armv8-2a for Android Q and R.
* Resolved COMPMID-4645
Change-Id: I1e1c2a5ae7ee7399a63fffe9eb3009da3bfe0d93
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5937
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Nikhil Raj Arm <nikhil.raj@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Rename to CpuWinogradConv2d
Allow memory to be injected externally
Change-Id: I1f0a26ea533e326a7c63df86e708895c31752a39
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5926
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
|
|
To reduce the risk of having a long OpenCL kernel, we limit the loop
unrolling on the kernel height. In particular, we unroll only if the
kernel height is less than or equal to 5
Resolves COMPMID-4604
Change-Id: Iece787989f36afb90f1c7676b53d9015e652bdbd
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5916
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-4511
Change-Id: Id6335cb23ef22bba02083498025da0ecb1647714
Signed-off-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5898
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Allows only implementations where inputs/output are of the same data
type and removes legacy Computer Vision ones.
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: Ia2b3d23a04236aab682f0c36a1110a30f7c06d1c
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5900
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Consider selection of the Direct Conv2d approach on GPU for floating
point inputs when kernel size is greater or equal to 7x7.
Resolves: COMPMID-4595
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: If851b2e5241dab51127bf9051aa79d210d3f4421
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5894
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Create LayerData type to store information about inputs/outputs of
nodes, also adds convolution specific information.
Resolves: COMPMID-4422
Signed-off-by: Freddie Liardet <frederick.liardet@arm.com>
Change-Id: I1c3be1abe2fb5ed085108ab3d34b14a1b8561d38
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5876
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Details:
Extend NEConvertQuantizedSignednessKernel
Port NEGEMMInterleave4x4Kernel to CpuGemmInterleave4x4Kernel
Port NEGEMMTranspose1xWKernel to CpuGemmTranspose1xWKernel
Port NEGEMMLowpMatrixAReductionKernel to CpuGemmLowpMatrixAReductionKernel
Port NEGEMMLowpMatrixBReductionKernel to CpuGemmLowpMatrixBReductionKernel
Port NEGEMMLowpOffsetContributionOutputStageKernel to CpuGemmLowpOffsetContributionOutputStageKernel
Port NEGEMMLowpOffsetContributionKernel to CpuGemmLowpOffsetContributionKernel
Resolves: COMPMID-4403
Change-Id: I3227f052f25e7b41d073bbea1da8a881fcd78b8e
Signed-off-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5875
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
|
|
- C facing interfaces
- Activation layer descriptor
Partially Resolves: COMPMID-4512
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I8530748fff178ca46d61bf2e40d3e543e93f340b
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5908
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Check for Fp16 and Bfloat16 support was done using compile time
information. Switch to use runtime information through CPUInfo
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: Ia3fa20c2b9b93be889667deb2e9366eb7cbe4aea
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5915
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
The issue is caused by the number of iterations passed to
LOOP_UNROLLING. When we use the manual LOOP_UNROLLING, the number of
iterations must be less than or equal to 128.
To overcome this problem, we create a utility function to check if
any of the critical iterations (kernel dimensions) are beyond that
limit. If so, the utility function, disable the manual loop unrolling.
Resolves COMPMID-4609
Change-Id: I7221c967609e462a5abd1cbb74e2a120f344fcb3
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5913
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-4510
Change-Id: Ia3e588f599449d975dabad4afafb2974dd44d0ad
Signed-off-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5899
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Add Compute Library description
- Add links to download pre-built binaries per platform
- Add links to download pre-built binaries per architecture
Resolves COMPMID-4495
Change-Id: Ieb10cae669e9e66f1b5b982c6b9c4bcc19c95250
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5904
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I94f2b9135a7de78888418f0af33e3e5f78e2a1fa
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5901
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Remove dedicated kernels for NCHW. Now we only use NHWC with permute
- Remove specialized kernels for 3x3 NHWC
- Simplify CLDepthwiseConvolutionLayer.cpp to call just the native
implementation for both floating-point and quantized data types
- Develop two parametric opencl kernels for depthwise convolution layer NHWC
(floating-point and quantized)
- Add support to export the weights to cl_image
- Extend test for depthwise convolution on opencl
Resolves COMPMID-4417
Change-Id: Ibe533f79c2860f9cac8e921895d5a8f947753a5c
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5893
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Redirect validate documentation to configure
- Align header names
- Align class layout
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: Ia40f67383826a66e9f9a33745d66805551e31a3a
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5897
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
|
|
- Complete porting of NEGEMM to the new API
Resolves: COMPMID-4402
Change-Id: I14904102b25332dbb4fc048d45dca068a15b6eca
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5890
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
* Implement in-place graph node mutator for 1x1 depthwise convolution
* Add in-place to validation fixture except for
DepthwiseConvolutionLayerNativeValidationFixture as it would be a
duplicate test otherwise (DepthwiseConvolutionLayerNative test tests
the underlying kernel)
Resolves: COMPMID-4432
Change-Id: Id7f10f5ebdce7d49f550c0b62dbaaab7f5b59d29
Signed-off-by: SiCongLi <sicong.li@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5874
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Add `T_QUANTIZE8_PER_TENSOR` and `T_QUANTIZE8_PER_CHANNEL` that can be
used to perform quantization on tile constructs.
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: Ie8e1efcb895c64715620acf2212b1de9a857ee0a
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5891
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I7f3790d8ca592ec46ff2e2de810cf402191a990e
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5881
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolve COMPMID-4602
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Change-Id: I4727586711a0dbdc859b043cfd55d0335f1c8eb0
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5888
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I96630b45887eaba16ef358b95f3d9ac0b9045157
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5882
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
This reverts commit 561c176598cd14245e2e7918fdf136d1c888d1da.
Reason for revert: <validation>
Change-Id: I6f2d61c27520439bb538e9265736532104b24cf8
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5127
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Ported kernels:
- CLGEMMLowpMatrixMultiplyNativeKernel
- CLGEMMLowpMatrixMultiplyReshapedKernel
- CLGEMMLowpMatrixMultiplyReshapedOnlyRHSKernel
- CLGEMMLowpOffsetContributionKernel
- CLGEMMLowpOffsetContributionOutputStageKernel
- CLGEMMLowpQuantizeDownInt32ScaleByFixedPointKernel
- CLGEMMLowpQuantizeDownInt32ScaleByFloatKernel
- CLGEMMLowpQuantizeDownInt32ScaleKernel
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I9d5a744d6a2dd2f2726fdfb291bad000b6970de2
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5870
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Port NEGEMMMatrixMultiplyKernel to the new API
Partially resolves: COMPMID-4402
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Change-Id: I52b67055dc24bb3a417d6ec5aeeee86e21b74320
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5873
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
- Add loop unrolling on X and use POOL_X and POOL_Y defines for the for
loop
Resolves COMPMID-4573
Change-Id: I33cb825cfb55912ccb0ab9d03bd33a3dab4c8b44
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5872
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Start porting NEGEMM to the new API
- Port NEGEMMInterleave4x4Kernel to the new API
- Port NEGEMMMatrixAdditionKernel to the new API
- Port NEGEMMTranspose1xWKernel to the new API
- Remove padding from NEGEMMMatrixAdditionKernel
- Remove unused INESimpleKernel and ICPPSimpleKernel
Partially resolves: COMPMID-4402
Change-Id: I63edadddfe00a54586e5384d6a0211db25ae9042
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5857
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
CPU micro-kernel to be used was picked during kernel execution.
Move selection during configuration to reduce runtime overhead.
Standardize kernel names as follows:
<simd_tech>_<data_type>_<data_layout>_<kernel_name>
e.g. sve_fp32_nhwc_scale
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I544f1c08c8fef0f130a3bde61882ccb9a1f47f21
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5855
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolve COMPMID-4486
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Change-Id: Ib38b7943bd776a6d75d1da163908724c49eae73d
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5864
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Refactors the CpuInfo extraction code and cleans the usage of it in the
CpuContext class
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: Iea47dfdaad431fb49285da92778d6b42cf318f60
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5848
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
|
|
* Fixed a problem which made the library choose the Valhall path instead of Bifrost
* Set block sizes to 4 for u8 on G31
* Resolves MLCE-463
Change-Id: I577f57ec96f740db15d6640e34172318a82c5778
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5858
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Resolves: COMPMID-4506, COMPMID-4570
Change-Id: I6d37a06da141f1fcfcaa8525322a319cb0234791
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5824
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I82a298e059a92d82095186d120e4934a8eeb4e3e
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5854
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Fix arm_compute_core typo when using fat binary.
Change-Id: Iaf20f7735354ec4db161edbcc9a51bc449fac15c
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5852
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Now core and runtime files are build in the same library,
libarm_compute.so and libarm_compute-static.a. Create a dummy
core library for not breaking backwards compatibility.
Update documentation on fat binaries and new high priority
library.
Fixed issue on scons when using the install_dir feature.
Change-Id: I5a8a6bbe2808243eba07fd135496995259d8702f
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5845
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Resolve COMPMID-4416
Change-Id: I83cdf0de7adaf4d465ffebd494ab913182072485
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5788
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Remove dedicated kernels for NCHW. Now we only use NHWC with permute
- Remove specialized kernels for 3x3 NHWC
- Simplify CLDepthwiseConvolutionLayer.cpp to call just the native
implementation for both floating-point and quantized data types
- Develop two parametric opencl kernels for depthwise convolution layer NHWC
(floating-point and quantized)
- Add support to export the weights to cl_image
- Extend test for depthwise convolution on opencl
Resolves COMPMID-4417
Change-Id: I253dd5d959a70783c82e62b1771a5e9f91621cb0
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5806
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
|
|
Add new ARM_COMPUTE_ENABLE_NEON flag
Change-Id: Idc8a19072c41e8fa6c28b50102725c263291a53e
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5844
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
A smaller core library is created using a subset of the operators.
Changed the structure of filelist.json in order to include more
information about the kernels and make the selection easier.
Resolves: COMPMID-4514
Change-Id: I079ca7d8e64346174eebdd13b834e1dd4dc36ca2
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5786
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Add in-place computation for elementwise operations at graph level
- Modify support case to test in-place computation for elementwise operations
Resolves: COMPMID-4414
Signed-off-by: Sheri Zhang <sheri.zhang@arm.com>
Change-Id: I5a4de1235dd29a31160e770a16d62f4b98c84ae6
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5803
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|