Age | Commit message | Author |
|
FP16/QASYMM8
When the GEMM3D check fails, we now fall back to the classic implementation with im2col
and col2im. This way the function can work with QASYMM8 and FP16.
Change-Id: I359e9da3a63956f33b5acbc9bca4383b14af10e2
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143372
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I5188a2163e7341f1915d98c21464fea13a9a7faf
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143330
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
|
|
Change-Id: I4afb19751520a90fee27fb49b775cd10e92a94f5
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/140476
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
This causes problems when ACL is used as a shared library on Android.
Fixes some problems related to creation / destruction order between the Graph's CL backend and core / runtime
Change-Id: I716d63fd42f4586df1ffbb6fa97e4db06d3a781b
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143228
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
CLWidthDepthConcatenateLayerKernel
Change-Id: Icab813cd432174608621ee6a87015aeb10ab822d
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143570
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Changed RelativeTolerance to Absolute for F16/F32 as the values can
be very close to zero for large inputs.
Change-Id: Ibeab9f4e4d218e4ceaad00b1725acc34e80c7afb
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143576
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
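
The reasoning behind the tolerance change above can be illustrated with a minimal sketch. The helper names `within_relative` and `within_absolute` and the numeric thresholds are hypothetical and are not the library's validation API; they only show why a relative check becomes too strict when the expected value is close to zero.

```python
def within_relative(actual, expected, rel_tol):
    """Pass if the error is at most rel_tol times the expected magnitude."""
    return abs(actual - expected) <= rel_tol * abs(expected)


def within_absolute(actual, expected, abs_tol):
    """Pass if the error is at most a fixed bound, independent of magnitude."""
    return abs(actual - expected) <= abs_tol


# Hypothetical values: both results are effectively zero, but they differ
# by a factor of three, so a 1% relative check rejects them.
expected = 1e-7
actual = 3e-7

relative_ok = within_relative(actual, expected, rel_tol=0.01)
absolute_ok = within_absolute(actual, expected, abs_tol=1e-5)
print(relative_ok, absolute_ok)
```

With these illustrative numbers the relative check fails while the absolute check passes, which is the situation the commit describes for outputs near zero.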
|
|
Change CLReductionOperation border to be a multiple of 64 instead of 16.
The OpenCL kernel works only with local_size(0) being a power of 2. This will
generate a padding of 63 if input_width % 64 = 1, but I don't think it's a
big issue and it keeps the border calculation pretty simple.
Also, increased tolerance for fp32 because there were mismatches
for the 4K image.
Change-Id: Id44990a262b2d6eff4c8ce56eb7c886274d9847e
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143415
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
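
The padding figure quoted in the commit above can be reproduced with a short sketch. It assumes the border is whatever is needed to round the input width up to the next multiple; `right_border` is a hypothetical helper, not the function used in CLReductionOperation.

```python
def right_border(input_width, multiple=64):
    """Padding needed to round input_width up to the next multiple."""
    return (multiple - input_width % multiple) % multiple


# input_width % 64 == 1 is the worst case mentioned in the commit message.
print(right_border(65))       # 63 elements of padding
print(right_border(64))       # already aligned, no padding
print(right_border(65, 16))   # previous multiple of 16: 15 elements
```

Under this assumption the worst case grows from 15 to 63 padded elements, which matches the trade-off the commit message accepts in exchange for a simple border calculation.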
|
|
Wrong boundary condition in the im2col3x3_nhwc kernel
Change-Id: I83e9dd9b425fd0e3227decb1da3d08a3f5e2536d
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143489
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
This change helps to prevent errors like passing a matrix
with fewer elements than required into the warp functions.
Change-Id: I863f933a5e0568258717cffed3a20788d3d03083
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143044
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Removing support for uint8_t (QASYMM8) in the reference function that accepts dst_data_type should be enough.
Change-Id: I46a43facf25463a8cbd3c5d5820c2cc06259ff10
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143399
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
QASYMM8
Also fixed a bug in the graph API related to the bias shape in DepthWiseConvolution for NHWC
Change-Id: I275141a42e51f6747b77db1c31d1bc69e8685af5
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143454
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
The flag "ChannelsFirstOutputNHWC" was not set
Change-Id: Id5f64a839d4e86638a07090e971a4f7ee82af349
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143457
Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Ie26b78c9da635206c96111ea490ac565063838ba
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143408
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
|
|
- Reverse dimensions when loading a non-Fortran order tensor
- Support saving tensors with arbitrary number of dimensions (not just 2)
- Fixed a minor bug in SONAME generation
Change-Id: I36aa0b05c9d3568d1296da2d84d5e299b40459cc
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142794
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
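
The dimension reversal in the first bullet of the commit above can be sketched as follows. The sketch assumes the library stores the fastest-varying dimension first, so a C-order (non-Fortran) shape has to be reversed while a Fortran-order shape can be used as is. `to_library_shape` is a hypothetical name, not the loader's actual function.

```python
def to_library_shape(npy_shape, fortran_order):
    """Map a .npy header shape to a shape whose first entry varies fastest.

    In C order the last index varies fastest, so the shape is reversed.
    In Fortran order the first index already varies fastest.
    """
    shape = tuple(npy_shape)
    return shape if fortran_order else tuple(reversed(shape))


print(to_library_shape((2, 3, 4), fortran_order=False))
print(to_library_shape((2, 3, 4), fortran_order=True))
```

The function works for any number of dimensions, which lines up with the second bullet about tensors that are not just 2D.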
|
|
+ validate() function
Change-Id: I6808de0254a7c4bca440322cc14b795b3b32465b
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142427
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: If15e06ad3aa092d32c4d88172a9fea79a7416b2b
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143128
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Iaf8519bc483b947876a9b6ba83b4eb43b45b83a1
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143135
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: Iea248dca88828669b680aeacbbf2b359d2bed304
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143143
Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
This patch includes:
- Im2Col optimizations for NHWC using a new data layout
- Refactoring of CLIm2ColKernel adding validation method and auto-init
- Removed im2col_reduced from CLIm2ColKernel and created a new kernel CLFlattenLayerKernel
Change-Id: I1620640b6796baa268324b33ae92cdd8de53e27c
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/141241
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
|
|
Also fixed the calculation of num_elements in access_numpy_tensor
Change-Id: Ic1a394ff829746d7803b81360830bade63b6b82a
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143132
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Idde333308db71087ec234b3fd1eb4e36a44db46c
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143049
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Without the check introduced by this patch, all weak edges are marked as strong
edges.
Change-Id: I874ebf22c06707bd98bd11b9be93602bfcbafa7c
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142922
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
|
|
The previous implementation of GEMM3D degraded the performance when the
input had to be reinterpreted as 3D. However, if both input and output have to be
reinterpreted as 3D, we can skip the offset calculation for that specific case
and run the multi GEMM approach.
Change-Id: I0d5d48add2c6ccdebfbb268ea199dd181101f3aa
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142872
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I7bbab53f18a42f0879d80122a52bb6bdca4b8631
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142413
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
- Ported PrepareB kernel from gemm_interleave
- Ported TransformA feature from gemm_interleave
- Allocate reshaped a and b buffers
- Added memory_manager / memory_group
- MatrixMultiply kernel
- Interleave kernels execution.
- Fixed a few bugs: all nightly Convolution tests passing for threads=1
and threads=4
- Added Doxygen documentation and comments in the code
- Added support for all supported data types
Change-Id: Iffa1c09fda0bb9c61213bb83524d5a48e7ecb03c
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/141281
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: I55f0018ac7214775ebbca63f58a3bf5c93732fec
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142632
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I986099c269498cc7971b10ee634dba721954546e
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142647
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: I3b8a6c00e61ba6da459ca5fc7275393f9d073aed
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142533
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Iaa93a497e7913c27f2fd09e974125cda5f04bc4b
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142463
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
im2col was also being skipped when the strides were not unit strides
Change-Id: I04c63a6dda8553b3890e832a56ff6854349c829a
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142520
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
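
A minimal sketch of the corrected condition follows. It assumes the fix concerns the shortcut, mentioned elsewhere in this log, that skips im2col for 1x1 convolutions in NHWC. `can_skip_im2col` and its parameters are hypothetical, and the real check may include further conditions.

```python
def can_skip_im2col(kernel_w, kernel_h, stride_x, stride_y, data_layout):
    """Return True only when the input can be fed to GEMM unchanged."""
    is_1x1 = kernel_w == 1 and kernel_h == 1
    # With a stride above 1 the input would have to be subsampled first,
    # so the shortcut is only valid for unit strides.
    unit_strides = stride_x == 1 and stride_y == 1
    return is_1x1 and unit_strides and data_layout == "NHWC"


print(can_skip_im2col(1, 1, 1, 1, "NHWC"))
print(can_skip_im2col(1, 1, 2, 2, "NHWC"))
```

The second call shows the case the commit addresses: a 1x1 kernel with non-unit strides still needs im2col.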
|
|
Change-Id: I0d253e6047216cfbd57dc807881c2b24d82c47f5
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142357
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I2d20cd3c5f83a9ba4e0de6659b255337877d5bbc
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142252
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: I2240b6a6430cb1d261458343b2900cc1f16ac414
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/141861
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Skipped im2col in CLGEMMConvolutionLayer for 1x1 convolutions with NHWC data layout
Change-Id: I894e6b952ed8605e8f3ffc0ffc25c24730d4664c
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/141909
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: I2c1e69b4654e928d8e7e9071258194f258bb6935
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142368
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I0fa02b8cc9289cfc4c89bea3f2041db938204948
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142232
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I7670f79209a1e4439d57e05c1f5c576f600971cb
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142299
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I0c155d0d8a56fc6610dc2476e669456c7d2cc87b
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142068
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Alters the ending conditions for y dimension to use the actual end
offset as a bound and not the actual y window as this could be the whole
execution window and can lead to overlapped calculations across threads.
Change-Id: Ic6642bbaa8e85d4a4034a44234d6cb3347a2f4ff
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142229
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
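
The bound described in the commit above can be sketched in a simplified form. The sketch assumes the execution window is split along y into one sub-window per thread; `split_rows` and `rows_for_thread` are hypothetical helpers and do not reflect the library's Window API. Each thread stops at the end of its own sub-window, so no row is computed twice.

```python
def split_rows(y_start, y_end, num_threads):
    """Split the non-empty range [y_start, y_end) into contiguous sub-windows."""
    step = -(-(y_end - y_start) // num_threads)  # ceiling division
    return [(s, min(s + step, y_end)) for s in range(y_start, y_end, step)]


def rows_for_thread(sub_window):
    """Rows a thread processes, bounded by its own end offset."""
    start, end = sub_window
    return list(range(start, end))


windows = split_rows(0, 10, 4)
all_rows = [row for w in windows for row in rows_for_thread(w)]
print(windows)
print(all_rows)
```

If each thread instead iterated up to the end of the whole execution window, every thread after the first would repeat rows owned by later threads, which is the overlap the commit message mentions.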
|
|
- Enables cell-to-input weights when !cifg and peephole
- Makes projection bias conditional
Change-Id: Iee866db9f5d8479c2dfd95d74a2d42492bf07a8d
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/140543
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Les Bell <les.bell@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: Ief1b6df40623c9f304093cf1f188c86454da3f9c
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/141965
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: Iba115d5df9d3b5802899318e2e68c33454731e33
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142251
Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I5f2c198f7ac4d8996180e204e763ab53f5e7ea3d
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142153
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Matteo Martincigh <matteo.martincigh@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: Ie9a6a896da142198243139fb9f8be0f83b87ccce
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142130
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Vidhya Sudhan Loganathan <vidhyasudhan.loganathan@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I39e354327a87ebe838af9f1cd57b5800517cf7ea
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/141964
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Apply offsets and strides to winograd transform functions in NEON.
Change-Id: Ia4f44d22244203a5f9d93d2fed73570396b0d28c
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/141803
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
is supported
Change-Id: I4c5121e0f000d5ee94a8c8c5326272806f643e35
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/141520
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: Ief9b717fe2bcf626660109ec491f8882d0ef06d7
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/141658
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I429087f8aa436cf0877c3abec8fd7201bec1b81c
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/141661
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Iabc54a3a1bdcd46a9a921cda39c7c85fef672b72
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/141449
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|