Age | Commit message (Collapse) | Author |
|
The Following kernels have been split into nchw/nhwc kernels files:
- batchnormalization_layer
- batch_to_space
- channel_shuffle
- depth_to_space
- dequantization_layer
- im2col
- normalization_layer
- normalize_planar_yuv_layer
- normalize_planar_yuv_layer_quantized
- pooling_layer
- pooling_layer_quantized
- remap
- reorg_layer
- scale
- scale_quantized
- space_to_batch
- space_to_depth
- upsample_layer
- winograd_filter_transform
- winograd_input_transform
- winograd_output_transform
The following kernels have been moved to nchw folder:
- direct_convolution1x1
- direct_convolution3x3
- direct_convolution5x5
- direct_convolution_quantized
- prior_box_layer
The following kernels have been moved to nhwc folder:
- direct_convolution
- dwc_native_fp_nhwc
- dwc_native_quantized_nhwc
The following kernels have been removed:
- sobel_filter
While the rest kerenls have been moved to the common folder.
Partially resolves COMPMID-4453
Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com>
Change-Id: Ic327ac935687ec351c610c65a3c6357f364a5a58
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5919
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolve COMPMID-4339
Change-Id: Ieacd288ca8f69e04d88692a093325a24198ebe3e
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5623
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Remove out-of-date and unmaintained LocallyConnectedLayer for both NEON
and OpenCL.
Resolves: COMPMID-3924
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: Ia61398ed8cfa3876f41c1b342c4a80d1cca0ca83
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4634
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves COMPMID-3977
Change-Id: I222e0d1726993e54699646323820fc4ae53ab520
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4530
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
* Remove default definition for STORE_BLOCK_BOUNDARY_AWARE to avoid elusive bugs
* Clean up gemm_mm_interleaved* and gemm_mm_floating_point* kernels
* Relocate to gemm_v1.cl to avoid clashing with new kernels
* Rename compile time arguments to conform with the established
terminology(MNKB), and to facilitate the use of STORE_BLOCK_BOUNDARY_AWARE
Change-Id: Ia85c746b2536cad87257a79685b459b5d2f9a1be
Signed-off-by: SiCong Li <sicong.li@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4329
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Add has_pad_y flag in GEMMKernelInfo
- Skip reinterpret as 3D in CLGEMMMatrixMultiplyReshapedOnlyRHSKernel if
has_pad_y = false
- Add test to validate CLGEMMMatrixMultiplyReshapedOnlyRHSkernel with
had_pad_y = false/true
- Configure two variants of CLGEMMMatrixMultiplyReshapedOnlyRHSKernel to
run with has_pad_y = false/true in CLGEMM
- Check if the lhs/dst tensors have pad y. If not, run
CLGEMMMatrixMultiplyReshapedOnlyRHSKernel without padding requirement
Change-Id: I68bb43389789736d676b899ac7c77fd9138babaf
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4248
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
FP16/32
Removed unused N from partial block loading macro
Created utility to assert change in padding
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Change-Id: Ifdd30c66dbf5f2842c6b2d939000613d5011708e
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4192
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I09f557b5cecafc669e12764e8592457212168d62
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4131
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
CLGEMMMatrixMultiplyReshapedKernel
- Change the interface of STORE_BLOCK_BOUNDARY_AWARE passing the
conditions on Y and X rather than the X/ coordinates. This allows to
use the macro with both GEMM reshaped and GEMM reshaped rhs only
- Remove padding from the output tensor of
CLGEMMMatrixMultiplyReshapedKernel
- Add tests for validating the zero padding requirement
Change-Id: I13263cc71ce065c5be34ed198def320dd5823495
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3712
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Remove padding requirement for the input tensor of
CLGEMMReshapeLHSMatrixKernel
- Add utility function to load a boundary aware 2d tensor from buffer
- Extend validation for validating the zero padding requirement
Change-Id: I0ac6b1b517d75fd56998f406e0cce97b40918ce1
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3701
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
CLGEMMMatrixMultiplyReshapedOnlyRHSKernel and CLGEMMMatrixMultiplyNativeKernel
Resolves: COMPMID-3333, COMPMID-3334
* Implement an "overlap load, but don't overlap store" strategy:
- Change STORE_BLOCK_BOUNDARY_AWARE so that the partial block in y
dimension is placed at the beginning instead of at the end.
- Implement 3 auxiliary functions to calculate the lhs, bias and dst
addresses, taking into account the potential partial block in y dimension.
* Remove y load padding from Lhs and Bias tensors in
CLGEMMMatrixMultiplyReshapedOnlyRHSKernel and CLGEMMMatrixMultiplyNativeKernel
* Modify config tests to assert zero-padding in new dimensions
Change-Id: I8f8585c7c0f543d720c2c91b885417c7dad35af4
Signed-off-by: SiCong Li <sicong.li@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3576
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Also removes some unused code.
Change-Id: I85687c40999c3cdf9e6fccfcd020b0901a9515fe
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3581
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
COMPMID-3338 Remove store padding in CLGEMMMatrixMultiplyReshapedOnlyRHSKernel
COMPMID-3336 Remove store padding in CLGEMMMatrixMultiplyNativeKernel
COMPMID-3584 Fix VSTORE to correctly deal with scalar case
* Implement STORE_BLOCK_BOUNDARY_AWARE, as part of the COMPMID-3332
investigation, with the following substantial changes:
- Separate STORE_BLOCK_PARTIAL, STORE_ROW_PARTIAL and VSTORE_PARTIAL
so that this change does not affect kernels not using STORE_BLOCK_BOUNDARY_AWARE.
- Revamp vstore_ext_n to vstore_partial_n, and enhance
VSTORE_PARTIAL to correctly handle both vector and scalar cases
* Remove the store padding (dst tensor) in CLGEMMMatrixMultiplyReshapedOnlyRHSKernel
and CLGEMMMatrixMultiplyNativeKernel
* Add configuration tests to check no padding is added by the
configuration.
Signed-off-by: SiCong Li <sicong.li@arm.com>
Change-Id: I4f0907867979d8dacedd03b4bcbd2fb19e4f1602
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3522
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
preferred presentation
Change-Id: Ib7dcfcbb24b408999dfae366b9da396485aacf78
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3525
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
COMPMID-3323: Add cl_image support for GEMMReshapedOnlyRHS T
- Added support for cl_image in CLGEMMMatrixMultiplyReshapedInlyRHSKernel (both NT and T kernels)
- Extended the tests for the validating rhs_info.export_to_cl_image = true
- Updated doxygen documentation in CLGEMMMatrixMultiplyReshapedOnlyRHSKernel.h
Change-Id: If253794323aac072d84a4d8680b9a2339ab7ad92
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3437
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
The performance regression was caused by a change in the interface
of the OpenCL kernels gemm_mm_reshaped_lhs_*
Change-Id: I030df4975dc040886c17e71710a27137b50edd9b
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3465
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
COMPMID-3321: Add cl_image support for GEMMReshaped NT_T
- Added support for cl_image in CLGEMMMatrixMultiplyReshapedKernel (both
NT and T kernels)
- Extended the tests for the validating rhs_info.export_to_cl_image =
true
- Added utility macros in OpenCL to load data from a OpenCL image object
- Updated doxygen documentation in CLGEMMMatrixMultiplyReshapedKernel.h
- Updated doxygen documentation in CLGEMMReshapeRHSMatrixKernel.h
Change-Id: I953b10e4ef205d1b76dcbc366e5a91fd5a8e1d5c
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3329
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Signed-off-by: Sheri Zhang <sheri.zhang@arm.com>
Change-Id: I7335ee07f777087e06ca26f762b2b5e3668362ab
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3175
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Sang-Hoon Park <sang-hoon.park@arm.com>
|
|
Using the accumulator type during the activation in gemm for mixed
precision operations as some OpenCL compiler versions complain for
mixing float data types for primitive built-ins.
Change-Id: I1c4b394c57962aeebb541de572614a7e3ee3f223
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-on: https://review.mlplatform.org/c/2080
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I5ba90d4de4594ed784c7230aa6b10503be67c001
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/1991
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I8adb8850cc5ade49ebc1dbf63401f03d5ecad708
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/1983
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I47b84a6f815492e24771d488aa8b29d14e572f40
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/1956
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
LHS (t) and not-transpose RHS
Change-Id: I437a00d7213fefd6f4365071b46174d44df8b85c
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-on: https://review.mlplatform.org/c/1677
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Fused beta*bias in in the old cl gemm kernels
Fused activation function in the old cl gemm kernels
Change-Id: I695fb9189e6d4792010abd256784624982d17d79
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/1587
Reviewed-by: Giuseppe Rossini <giuseppe.rossini@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Fuse activation function in:
CLGEMMMatrixMultiplyNativeKernel
CLGEMMMatrixMultiplyReshapedKernel
CLGEMMMatrixMultiplyReshapedOnlyRHSKernel
Change-Id: I033ace2bdc58903594c9f31175e4b23c4b559f6f
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/1565
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Giuseppe Rossini <giuseppe.rossini@arm.com>
|
|
Change-Id: I714b92ec001fc71172719b67fb66d490538b6948
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/1399
Reviewed-by: Giuseppe Rossini <giuseppe.rossini@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I5bfd38c94a6fd18a1cba2104f7e1b04e7bef6ec2
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/1359
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I1d1e1f28fe7022309d72900893e8368820ca0f89
Signed-off-by: giuros01 <giuseppe.rossini@arm.com>
Reviewed-on: https://review.mlplatform.org/c/1259
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I527fc97eac51308de601e5d1d50e75e4d89c5ee5
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/1158
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Giuseppe Rossini <giuseppe.rossini@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I347130f6b5ae8d08b7c5c101b523b158565874a1
Signed-off-by: giuros01 <giuseppe.rossini@arm.com>
Reviewed-on: https://review.mlplatform.org/c/1114
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I7203d7e4d5540536b5e6638c81b26a955aa70f5c
Signed-off-by: Usama Arif <usama.arif@arm.com>
Reviewed-on: https://review.mlplatform.org/c/1144
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: I3907d151107766dc34749fe5710d7219e810b39f
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/875
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Giuseppe Rossini <giuseppe.rossini@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: I89403b97503fbb99f6a32f5d62b8c535ab26a7be
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/877
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I6b7f7a406fcb8d64adb221a24d7983e11bcb391e
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/846
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
|
|
Change-Id: I364c7ec5a43ad391a73429489802b0e679ee0c6e
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/732
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: I378f2023f4fa010f195f76716ac07aa86279bfae
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/280
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: I01656785fe58e6157469f6c6e66bb7eec53380f1
Reviewed-on: https://review.mlplatform.org/563
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Ide950b46c4d41de230c272c7044a03f4f9f237ed
Reviewed-on: https://review.mlplatform.org/548
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Giuseppe Rossini <giuseppe.rossini@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Extended CLGEMMMatrixMultiplyReshapedKernel to support more parameters
Change-Id: I4a27f986e3fe2dd071a4ccba5cfa0565f3db39ad
Reviewed-on: https://review.mlplatform.org/495
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
|
|
Change-Id: I2b0dbfe7d430a8d0f62eb906f0334b16cde9e45b
Reviewed-on: https://review.mlplatform.org/457
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
gemm_reshape_rhs_matrix_nt
Change-Id: Ic935f1947f3f3a60455a8108fcf6f313258abe51
Reviewed-on: https://review.mlplatform.org/425
Tested-by: Anthony Barbier <Anthony.barbier@arm.com>
Reviewed-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
|
|
Change-Id: I913a7297a0c34a05b1d37eab1489b430423700e8
Reviewed-on: https://review.mlplatform.org/417
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
|
|
Change-Id: I9af1f7263c6e71e38af97f3112d35044cf60ddf0
Reviewed-on: https://review.mlplatform.org/403
Reviewed-by: Anthony Barbier <Anthony.barbier@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
The current implementation is limited just to FP32
Change-Id: I185ab57e483e879d7c301e9cc3033efc8b41e244
Reviewed-on: https://review.mlplatform.org/389
Reviewed-by: Anthony Barbier <Anthony.barbier@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
|
|
matrix of GEMM/GEMMLowp
Change-Id: I77f2bfcc5d170bcc2428a2f27104942c1ec877d7
Reviewed-on: https://review.mlplatform.org/375
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
matrix of GEMM/GEMMLowp
Change-Id: I8c5fd4c8bcdffda1522c83158981ed92baa045f4
Reviewed-on: https://review.mlplatform.org/364
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
FP mixed precision support added to GEMM kernel used for fp16 winograd conv on Midgard GPUs
Change-Id: I1619beb025fc484a1ac9d3e528d785edabbc7ee6
|
|
Introduced F32 accumulation for F16 winograd gemm and output transform
WinogradConvolution will be available for F16 only if fast math flag is enabled
Change-Id: I215593c205236a0f9669218437bb40b184ec6a4f
|
|
OpenCL
COMPMID-1424 - Add dot product support for CLDepthwise QASYMM8 3x3 NHWC non-unit stride
With this patch we are able to improve the performance of MobileNet v1-qasymm8 by 37 %
Tried to use the dot product instruction in CLDepthwise QASYMM8 3x3 NHWC non-unit stride
but I have not seen any benefit (maybe because we have few arithemtic operation and we
do not have more load instructions). However Depthwise convolution has been improved by
30%
Change-Id: Id768a99c2e53a04276707e427af5d0ec93419ada
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155082
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Skipped im2col in CLGEMMConvolutionLayer for 1x1 convolutions with NHWC data layout
Change-Id: I894e6b952ed8605e8f3ffc0ffc25c24730d4664c
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/141909
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|