Age | Commit message (Collapse) | Author |
|
- Adds Qasymm8 and Qasymm8_signed support to the 3d pool operator
Resolves: COMPMID-4669
Signed-off-by: Mohammed Suhail Munshi <MohammedSuhail.Munshi@arm.com>
Change-Id: I36038c2b7c4f36baf67f7aae801356890e104538
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/410496
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Sheri Zhang <sheri.zhang@arm.com>
Comments-Addressed: bsgcomp <bsgcomp@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7391
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Partially resolves: COMPMID-5156
Signed-off-by: SiCong Li <sicong.li@arm.com>
Change-Id: I434586ac72d0f5a530e19108e6c5c319497c4fe0
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7411
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- For NDHWC layout
- For F16 and F32 data types
- Mixed Precision stil not supported
Resolves: COMPMID-4670
Signed-off-by: ramy.elgammal@arm.com
Change-Id: I0e14a13e4625569e8e5ee67e6033bd1efe0da469
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7262
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-5151
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Change-Id: Ic4024d5cd4819fe917a1d49621f1866ae2e90a37
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7260
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- pass tensor's dimensions at runtime rather than compile time
- Add guard macro to compile only kernel of internest
Resolves: COMPMID-5121
Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com>
Change-Id: I76b7c0cf56d803f58ebff5494c904ace2a86ef5a
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7097
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- pass tensor's dimensions at runtime rather than compile time
- Add guard macro to compile only kernel of internest
Resolves: COMPMID-5120
Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com>
Change-Id: I87c3b56ce0cd3c97ffdeabdd9c5d433f361bb005
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7101
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Remove CLRemapKernel.
- Remove NERemapKernel.
Partially resolves COMPMID-4984
Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com>
Change-Id: Ia61f9ac7447695d81178701cf0e9b7625a91eccc
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7056
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- pass tensor's dimensions at runtime rather than compile time
- Add guard macro to compile only kernel(s) of internest
Resolves: COMPMID-5119
Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com>
Change-Id: Ib01098e397011a1201c2800c62a8954ec70e63e8
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7083
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- pass tensor's dimensions at runtime rather than compile time
- Add guard macro to compile only kernel of internest
Resolves: COMPMID-5118
Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com>
Change-Id: Ie42c3c07fdd817ce62e7cad354381bc22c6e9264
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7058
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
This reverts commit 10e88a7351 "Rework gemm_mm_reshaped_only_rhs_ kernels with new macros"
Resolves: COMPMID-5095
Signed-off-by: Ramy Elgammal<ramy.elgammal@arm.com>
Change-Id: I46e167882f072e7508b6101d295accb6e089e740
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7045
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
- Rework gemm_reshaped_rhs_only with new TILE macros
- Fuse post ops in gemm_reshaped_rhs_only
Resolves COMPMID-4890
Change-Id: I944948ecec6d08deaf3545b80cd3eeac26e44205
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6944
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Sheri Zhang <sheri.zhang@arm.com>
|
|
Resolves COMPMID-4892
Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com>
Change-Id: I52f23ca293506fc693ae829daccc6e889a050752
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6833
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
- Delete old NCHW ClDirectConv2d kernels.
- Merge all kernels on a single file.
- Removed padding from ClDirectConv2dKernel
Resolves COMPMID-4721
Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com>
Change-Id: I624d218fb770e7b5f3c0acd4e85a21ae48470f55
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6779
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Resolve COMPMID-5004
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Change-Id: Ib3e1b5a891234316c411ea9825ec10c68c4ab5a3
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6788
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Sheri Zhang <sheri.zhang@arm.com>
|
|
- Pass arguments at runtime
- Rework ClConv2D heuristic to select direct convolution when OFM < IFM
also for small kernel sizes
Resolves COMPMID-5000
Change-Id: I9b538e29093829bc366d24d1e904341c247fa22b
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6771
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- In the dwc_native_fp_nhwc.cl, loop unrolling should only be enabled
when kernel height is less than 5.
- No performance regression experimented
- The patch reduces the compilation time required for the kernel
Resolves COMPMID-4887
Change-Id: I93188b9764cf7d1ad34ac164694f6f1fd37a90e8
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6744
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves COMPMID-4891
Change-Id: Ifdf2a0eaed23347a1b4465ea8d58c11b72083952
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6741
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
|
|
- Pass M,N,K at runtime as kernel parameters
- Add a guard macro to compile only kernel of interest
- Move reshpaing kernels to gemm_utils.cl
- Remove the fallback reshaping kernel with Y-Padding support
Resolves: COMPMID-4888
Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com>
Change-Id: Ida3851326f0b77e410633271de9ecca106e37931
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6662
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolve COMPMID-4889
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Change-Id: I4a88082b13865fdaeaba1b7216503cd640aa54df
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6680
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Pass source and destination tensor dimension info at runtime
Resolves: COMPMID-4887
Signed-off-by: Sheri Zhang <sheri.zhang@arm.com>
Change-Id: Ib7c9f3ce6fb7cef600f7b0cd0fadafa4fa6888a1
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6635
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
- Add macro guard for different kernels in scale.cl
- Rework TENSOR4D to the new format
- Pass scale_x and scale_y at runtime
Resolves COMPMID-4886
Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com>
Change-Id: Ib904a703d511fb8260618057ac92e5ea9efeee2b
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6619
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
post ops
* Add validate tests
* Restrict post ops support in ClGemmConv2d to only those that do not
need im2col or col2im. In practice this means we only support post ops
in conv1x1 with stride = 1, dilation = 1 and data layout = NHWC
Resolves COMPMID-4435
Change-Id: I1fdf0c5d565a4624857250075ac76db35c2f383b
Signed-off-by: SiCongLi <sicong.li@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6573
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
- ClGemmMatrixMultiplyReshapedKernel
- ClGemmMatrixMultiplyNativeKernel
- ClGemmMatrixMultiplyReshapedOnlyRhsKernel
Resolves: COMPMID-4713
Change-Id: I3adcb1b3d4af37ebcbc3bee19cc1845885d08600
Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6553
Reviewed-by: SiCong Li <sicong.li@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
* Revert "Remove padding in FP Cl Gemm kernels"
This reverts commit 48717a3d38fef8d316cd4b9fd9a3bc1a43db736b.
* Allow different boundary row handling strategies across native,
reshaped and reshaped_only_rhs kernels by introducing a
ELTWISE_OPERAND_ROW parameter to the macro
Resolves COMPMID-4919
Change-Id: Icefc23c0760a6abb838fef1d0d5bda06b07c79e3
Signed-off-by: SiCongLi <sicong.li@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6569
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
ClGemmMatrixMultiplyNativeKernel Part 3
Partially resolves: COMPMID-4435
Change-Id: Ifc5affa3a24a70942ca2d001380205df09b03ad7
Signed-off-by: SiCongLi <sicong.li@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6550
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
* Remove rhs and bias padding in ClGemmMatrixMultiplyNativeKernel
* Rework ClGemmMatrixMultiplyReshapedOnlyRHSKernel to use the same
padding boundary condition as the other kernels
Partially resolves COMPMID-4435
Change-Id: I1c17af9cca0b5cb3be087ce160948b7b0e62d297
Signed-off-by: SiCongLi <sicong.li@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6549
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
This interface supports the fusion of multiple elementwise operations
Partially resolves: COMPMID-4435
Change-Id: If68dd7dd98dcf239fde7cb1f0a4a6d4d1e899a6f
Signed-off-by: SiCongLi <sicong.li@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6483
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolve COMPMID-4663
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Change-Id: I5c3c1cffed5385c06b789543318f7f4d6096987e
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6468
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Sheri Zhang <sheri.zhang@arm.com>
|
|
Resolves COMPMID-4446
Change-Id: I1d3c2391b67681f4d3af440826aa95b47a1288a6
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6444
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
* Fixed the issue in NHWC Neon
* Fixed the rounding error in CL
* Added a new test case to reproduce the problem
* Resolves COMPMID-4831
Change-Id: I1613168cad580ca5acefe8ba340130af05cffaff
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6454
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Change-Id: I4d48f1b8eba6681a9de0ae5f1fd8a4ad1edf7fe8
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6439
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolve COMPMID-4660
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Change-Id: Ibd66ec1eb6faa60086981b1e3a9c12561df3445f
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6420
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Resolves COMPMID-4805
Change-Id: I0acd4479f196cf9518995a60d3b57a9a49e0db57
Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6413
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Pablo Marquez Tello <pablo.tello@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-by: Freddie Liardet <frederick.liardet@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
- The out-of-boundary condition was performed also for PARTIAL_STORE_N0
= 0
Resolves: COMPMID-4774 COMPMID-4771
Change-Id: I0d7e078c67615b513ffeb66860f224999b5135fa
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6302
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
|
|
The new kernel performs the computation on multiples elements. The
OpenCL kernel has been re-implemented using the new TILE macros
Resolves COMPMID-4803,COMPMID-4804
Change-Id: Iac8fead65e21b64567a05dbc4fbaa61d362443f9
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6235
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolve COMPMID-4450
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Change-Id: I6f280d5d66ec43fb5cb06c83fe15a1f227ad165d
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6232
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
When calling vload_partial, the macros were overriding the first values with a hidden double assignment
Resolve COMPMID-4792
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Change-Id: I96bca60ae546fc34a71e69d5c471581a472d8ddf
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6231
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Create new macros for loading values from memory while being aware of
boundaries of the tensor to not generate page faults.
Resolves: COMPMID-4447
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Change-Id: Ia5fd0a5dcb40942bccd5e686307d0055e1a1dd82
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6226
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
This reverts commit 50335fd3d0734157382741fcf1bfdaf630c60c4b.
Resolves COMPMID-4792
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Change-Id: Ia6580143d9cf5a7bd5c87ca4214022f7c241ec6f
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6214
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-by: Sheri Zhang <sheri.zhang@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Simplify NCHW kernel structure by removing old optimized paths
- Merge quantized with fp kernels
Resolve COMPMID-4722
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Change-Id: I79016b119619aed6a6193295601cd6517f14b88c
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6183
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
* Calculate border using both norm size and vec_size_x
* Expose reference tensor printer
Resolves: COMPMID-4793
Change-Id: I7bd8e49779baf7d6848271757bc7993aa1ed2960
Signed-off-by: SiCongLi <sicong.li@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6201
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Create new macros for loading values from memory while being aware of
boundaries of the tensor to not generate page faults.
Resolves: COMPMID-4447
Change-Id: If9a455291e395ebd9070ebe5e120b3064d8fab29
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6168
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Merge quantized kernels with fp for bilinear interpolation (both NCHW and NHWC)
- Pass dimensions at compile time rather than at run time
- Use tile-based approach to rework the NCHW kernels
- Remove unused functions/files
Resolve COMPMID-4723
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Change-Id: Ifcdf02beb9daa9f318395751b3c85eb2fe874082
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6138
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
|
|
The Following kernels have been split into nchw/nhwc kernels files:
- batchnormalization_layer
- batch_to_space
- channel_shuffle
- depth_to_space
- dequantization_layer
- im2col
- normalization_layer
- normalize_planar_yuv_layer
- normalize_planar_yuv_layer_quantized
- pooling_layer
- pooling_layer_quantized
- remap
- reorg_layer
- scale
- scale_quantized
- space_to_batch
- space_to_depth
- upsample_layer
- winograd_filter_transform
- winograd_input_transform
- winograd_output_transform
The following kernels have been moved to nchw folder:
- direct_convolution1x1
- direct_convolution3x3
- direct_convolution5x5
- direct_convolution_quantized
- prior_box_layer
The following kernels have been moved to nhwc folder:
- direct_convolution
- dwc_native_fp_nhwc
- dwc_native_quantized_nhwc
The following kernels have been removed:
- sobel_filter
While the rest kerenls have been moved to the common folder.
Partially resolves COMPMID-4453
Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com>
Change-Id: Ic327ac935687ec351c610c65a3c6357f364a5a58
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5919
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Fix warning found by oclgrind. Also remove duplicated code.
Resolves COMPMID-4675
Signed-off-by: Freddie Liardet <frederick.liardet@arm.com>
Change-Id: I6ad56cc0130b5df936f1e070db116695269317df
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5974
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
A typo was causing errors when building some CL kernels.
Resolves: COMPMID-4637
Change-Id: I7d1821e1a046ef8ccd306f72afe192732dd2ad1e
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5944
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
- Add in-place calculation support in ClArithmeticKernel, ClSaturatedArithmeticKernel and ClMulKernel
- Add in-place test cases
Resolves: COMPMID-4431
Signed-off-by: Sheri Zhang <sheri.zhang@arm.com>
Change-Id: Id484bdb76b74478a33fedb471ae0c7f799c599f6
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5885
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
To reduce the risk of having a long OpenCL kernel, we limit the loop
unrolling on the kernel height. In particular, we unroll only if the
kernel height is less than or equal to 5
Resolves COMPMID-4604
Change-Id: Iece787989f36afb90f1c7676b53d9015e652bdbd
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5916
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Allows only implementations where inputs/output are of the same data
type and removes legacy Computer Vision ones.
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: Ia2b3d23a04236aab682f0c36a1110a30f7c06d1c
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5900
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Remove dedicated kernels for NCHW. Now we only use NHWC with permute
- Remove specialized kernels for 3x3 NHWC
- Simplify CLDepthwiseConvolutionLayer.cpp to call just the native
implementation for both floating-point and quantized data types
- Develop two parametric opencl kernels for depthwise convolution layer NHWC
(floating-point and quantized)
- Add support to export the weights to cl_image
- Extend test for depthwise convolution on opencl
Resolves COMPMID-4417
Change-Id: Ibe533f79c2860f9cac8e921895d5a8f947753a5c
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5893
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|