Age | Commit message (Collapse) | Author |
|
- Provide support for Add operator
- Auto initialize the destination tensor before testing fusion in conv2d
and elementwise binary ops.
Resolves: COMPMID-5518
Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com>
Change-Id: Ibd815020f02b57f88eea7c2921bdcf98605d99c5
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8617
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
* Resolves MLCE-842
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Change-Id: Iae0521b25a5e6c9cc8046830f397d523dfbcc66e
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8542
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves COMPMID-5634
Change-Id: I075de70d509d0c4430b4bcf3f218384e237a3a56
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/453708
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: bsgcomp <bsgcomp@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8473
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
|
|
This patch introduces a GEMM routine that is optimized for Arm(R) Mali(TM)-G715 and Arm(R) Mali(TM)-G615
Resolves: COMPMID-5216
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Change-Id: I2e5d7806f5904347185bb3e250f73d73d6669dba
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7914
Reviewed-by: SiCong Li <sicong.li@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-5355
Change-Id: I92f73fbe885f28bbe7b07965b90cfd807c93602f
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7745
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
|
|
* Replace arm_dot(_acc) with dot when cl_khr_integer_dot_product
extension is available.
Resolves: COMPMID-5206
Change-Id: I7fd763e2421987584e4dae271008972644ea2f41
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7647
Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-5269
Change-Id: I4372ea4365d14ead79153e4b08b690a1e20ab0b7
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7531
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Partially resolves: COMPMID-5156
Signed-off-by: SiCong Li <sicong.li@arm.com>
Change-Id: I434586ac72d0f5a530e19108e6c5c319497c4fe0
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7411
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
This reverts commit 10e88a7351 "Rework gemm_mm_reshaped_only_rhs_ kernels with new macros"
Resolves: COMPMID-5095
Signed-off-by: Ramy Elgammal<ramy.elgammal@arm.com>
Change-Id: I46e167882f072e7508b6101d295accb6e089e740
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7045
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
- Rework gemm_reshaped_rhs_only with new TILE macros
- Fuse post ops in gemm_reshaped_rhs_only
Resolves COMPMID-4890
Change-Id: I944948ecec6d08deaf3545b80cd3eeac26e44205
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6944
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Sheri Zhang <sheri.zhang@arm.com>
|
|
Resolves COMPMID-4892
Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com>
Change-Id: I52f23ca293506fc693ae829daccc6e889a050752
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6833
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves COMPMID-4891
Change-Id: Ifdf2a0eaed23347a1b4465ea8d58c11b72083952
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6741
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
|
|
- Add macro guard for different kernels in scale.cl
- Rework TENSOR4D to the new format
- Pass scale_x and scale_y at runtime
Resolves COMPMID-4886
Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com>
Change-Id: Ib904a703d511fb8260618057ac92e5ea9efeee2b
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6619
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolve COMPMID-4660
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Change-Id: Ibd66ec1eb6faa60086981b1e3a9c12561df3445f
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6420
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Fix warning found by oclgrind. Also remove duplicated code.
Resolves COMPMID-4675
Signed-off-by: Freddie Liardet <frederick.liardet@arm.com>
Change-Id: I6ad56cc0130b5df936f1e070db116695269317df
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5974
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
A typo was causing errors when building some CL kernels.
Resolves: COMPMID-4637
Change-Id: I7d1821e1a046ef8ccd306f72afe192732dd2ad1e
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5944
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
- Remove dedicated kernels for NCHW. Now we only use NHWC with permute
- Remove specialized kernels for 3x3 NHWC
- Simplify CLDepthwiseConvolutionLayer.cpp to call just the native
implementation for both floating-point and quantized data types
- Develop two parametric opencl kernels for depthwise convolution layer NHWC
(floating-point and quantized)
- Add support to export the weights to cl_image
- Extend test for depthwise convolution on opencl
Resolves COMPMID-4417
Change-Id: Ibe533f79c2860f9cac8e921895d5a8f947753a5c
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5893
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Add `T_QUANTIZE8_PER_TENSOR` and `T_QUANTIZE8_PER_CHANNEL` that can be
used to perform quantization on tile constructs.
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: Ie8e1efcb895c64715620acf2212b1de9a857ee0a
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5891
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
This reverts commit 561c176598cd14245e2e7918fdf136d1c888d1da.
Reason for revert: <validation>
Change-Id: I6f2d61c27520439bb538e9265736532104b24cf8
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5127
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Remove dedicated kernels for NCHW. Now we only use NHWC with permute
- Remove specialized kernels for 3x3 NHWC
- Simplify CLDepthwiseConvolutionLayer.cpp to call just the native
implementation for both floating-point and quantized data types
- Develop two parametric opencl kernels for depthwise convolution layer NHWC
(floating-point and quantized)
- Add support to export the weights to cl_image
- Extend test for depthwise convolution on opencl
Resolves COMPMID-4417
Change-Id: I253dd5d959a70783c82e62b1771a5e9f91621cb0
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5806
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
|
|
Change-Id: Id98a107d512369d3799961011a84e9cc4d99e775
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5679
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I092d10534816f5b3717325952033c351b8231380
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5643
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- DOT_PRODUCT8_INTEGER8 and DOT_PRODUCT16_INTEGER8 are calling
DOT_PRODUCT4_INTEGER8 without passing DST_DATA_TYPE
Resolves COMPMID-4491
Change-Id: I394bd2f9208489e820885e49ed40e607d6470620
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5594
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
- Change kernel's vec_size to 16 / sizeof(output)
- Change ICLKernel.cpp to handle broadcast without padding
Resolve COMPMID-3913
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Change-Id: I03e884b250ef5784dc109bff8cf2c96b345d119f
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5450
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
- The cl_image object can be used for the weights
- cl_image can only work for f32/f16
- Fix the implicit padding on the first dimension X
Resolves COMPMID-4341
Change-Id: I04e0901c69e7765c42afceca38c4a840645b9123
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5393
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
- Rework Winograd Input Transform 3x3 NHWC using the new macros
- Rework Winograd Input Transform 5x5 NHWC using the new macros
- Rework Winograd Input Transform 7x7 NHWC using the new macros
- The new implementation is also faster than before
- Winograd Input Transform 5x5/7x7 3x faster
Resolves COMPMID-4139
Change-Id: Ia9c8af23a2d47d2db60ec4c44650a63a34ffa0d5
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5358
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
|
|
This patch reworks the winograd output transform 3x3 NHWC on OpenCL
- Use utility macros in tile_helpers.h to rewrite the kernel
- Implement the tile utility macro for the activation
Resolves COMPMID-4144
Change-Id: I86a9bb9ea96b9629a18642b56bb63750710e6af5
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5324
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
The new function can handle different block sizes (M0, N0)
New utility macros have been developed to simplify the work and the
future OpenCL kernel development. In particular the work has been done
to also consider cases with:
- the texture pipe support
- dynamic tensor shape support
Change-Id: Ife4c64baf07517938bb8ad18e6a5f4579345c40f
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5297
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|