Age | Commit message | Author |
|
Also, add validation test that hits the discovered failure for CL.
Change-Id: I5573e0a3f169b85d5fb7299e7c48d74be7165208
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/112717
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: Iafc16409430274d5126f0fb054b0de5de6b6ca8f
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116635
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
necessary
Change-Id: Iea8a21f7c71025bfde6fdf7c7a7c92ba749b189b
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116673
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Removes QS8 and QS16 tests from benchmarks.
Change-Id: Idf82d33159b2066d50ac2d454140938e43160779
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116626
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
MobileNet QASYMM8 dwc layers
Change-Id: I30eaea3f3625086e311ad201ef73a8f06a01e382
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116521
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: If2e14c19f16686a2a8e05832845f8bfcf0f0cdaf
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116537
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Workaround for Valgrind round() issue on aarch64.
Under Valgrind, std::round(-4.500000) returns -4.000000 instead of -5.000000. I think there is a bug
in Valgrind's code for aarch64 where the rounding mode is not properly set up, and that's the reason
why round to zero is used all the time.
Change-Id: If8fbee98e022856fcc48e454f7afd447f1f193e9
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116457
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
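
The commit message above does not say how the workaround is implemented. As a minimal sketch only, one way to avoid depending on std::round is to build round-half-away-from-zero from std::floor and std::ceil. The helper name round_half_away_from_zero is hypothetical and is not taken from the library.

```cpp
#include <cmath>

// Hypothetical helper (not from the library): rounds to the nearest integer,
// with ties going away from zero, which is what std::round is specified to do.
// It uses only floor/ceil, so it does not call std::round itself.
// Sketch only: inputs where value +/- 0.5f is not exactly representable
// (for example the largest float below 0.5f) are not treated specially.
inline float round_half_away_from_zero(float value)
{
    return (value < 0.0f) ? std::ceil(value - 0.5f) : std::floor(value + 0.5f);
}
```

With this sketch, round_half_away_from_zero(-4.5f) evaluates std::ceil(-5.0f), giving -5.0f, the value the commit message expects from std::round.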
|
|
Change-Id: Ic76b3b6adaff8c84ba4d2ca5283d9291c69344f0
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/114466
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
- NEDirectConvolutionLayer
- NEDepthwiseConvolutionLayer3x3
Change-Id: Id4d7d17ee334639c059015a290b8fc34712706ee
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115430
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
of CLGEMMMatrixMultiplyKernel kernel.
Change-Id: If035fa3d1fb3ff4012442bcd908c370d21aa6657
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115990
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Problem seems to happen when calling clFinish inside the CLScheduler
destructor. Removed the destructor and now calling sync() in the benchmarks'
main.cpp.
Change-Id: Ibb36a0d19aa03349d291407a1fb8266dce3ec75b
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116288
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Icbb569acdfb5cd9d669341921d585297a5840bb3
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116192
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
This patch also removed QS8 AlexNet benchmarking for NEON and set the
flag weights_reshaped to false for CL.
Change-Id: I8db21b007c3b25b870e9072f8e02e36d1c1281c9
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115999
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Adds generic pooling case for QASYMM8
Change-Id: I37d38a92ca61651e915fbbbb6da88e180390b4ab
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115439
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I244954f748169cefcf71409bc9fdbc45de816ba5
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115878
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Filling buffers with random data takes a significant amount of time and in most cases doesn't affect performance.
We will therefore only keep fill() in the functions for which it matters.
Change-Id: Ica34fe09941f27d6f0417f33176847febf722bc3
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115892
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: Ib178a97c080ff650094d02ee49e2a0aa22376dd0
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115717
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
CustomConvolutionSeparable
Change-Id: I81fae268d158aec882dbeadb5597dc9f7274d865
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115347
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: Iba1e2f021f19351edf849239d10fb9f3788a67c8
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115743
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Ie680065fe98c2fcdefad1fd5240f0a951df6e4cf
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115779
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I91e39713ffa580e9d2213988ad3517a8a41bf4e8
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/114013
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Use "vec2 scale" instead of scale_x/scale_y to work around this issue.
Change-Id: Ieae55327596fdb853d7b625262fec3a3a84f577c
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115143
Reviewed-by: Joel Liang <joel.liang@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Frank Lei <frank.lei@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I4833eec0734776d8683fe867bb4f4d827f1a2fb7
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115503
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: Ie00c6b08a51d30c5ce2637d40ee3d165b8a68686
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/110311
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
CustomConvolutionRectangle
Change-Id: I108a48ad5e6dc3f331fd5ceb38ced8ccdb31d81a
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/113130
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I180281e796e1670b9ad391d82d66ecde0119ef78
Note: this is for internal use only which is why I think the hackiness of RunExample.cpp is acceptable.
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115154
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: If72b649fce21d0b8b9c28a1b064c4cf5adb06c15
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115502
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
MobileNet
Change-Id: I72cdf54477838b01bc5fa1281b0b587646f1902b
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115396
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Removed the code that created a subtensor and imported memory from the workspace in the function's run() method.
The subtensor is no longer needed because we perform the reordering of the tensors with NEPermute. The call to the method
winograd::Winograd2x2_3x3GEMM<TOut, TIn>::reshape_output() will transform the results from the Winograd domain
into the spatial domain, and the result will be stored in the member _output_nhwc.
Change-Id: Iae09d26c7587cd2eed98968c3ce214e20031038e
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115483
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I23865486ef413c4d2495c537df4b1393b0e822df
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115395
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Call clFinish on destruction of the CLScheduler to ensure that no
leftover commands remain in the queue, which could cause the queue to be
retained and its destruction deferred.
Change-Id: Ic71933f65cdccd74f4f01a6e2ec1a049995f5b50
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115389
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Remove token pasting operator support for GLES shaders.
Remove cs_shaders/helpers.h (the old GLES shader common code).
Remove class BufferParam. We don't need to pass the buffer_data_type_shift to GLES shaders.
Change-Id: Ic4fa6b2fb7647b8f69759f6077ae4a5b483cc04d
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115448
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Frank Lei <frank.lei@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I62a7a1871b93fafc65eb58fa550bc86179bdffe7
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/112489
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: I2021612e61de1b82aaeb49249d06929c7fceb15f
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115216
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
This patch introduces an optimization for CLGEMM on Bifrost
architectures which can reach up to 40% FMA utilization on
config 3 of McVail. The new CLGEMM does not require any reshape of
matrix A and matrix B.
This patch also adds the auto-config in CLConvolutionLayer and CLGEMM
and extends the interface for NEGEMM and CLGEMM.
Change-Id: Ibb354eda45e9ca64b14a99700fb21dff5989dda9
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/113716
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I4aa3999159f0448592f5f704ebcd37b26f9b1e51
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115279
Reviewed-by: Joel Liang <joel.liang@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: Ic2be14d626856faa4496c588154ef5cfb66d4e2c
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115282
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Joel Liang <joel.liang@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Renamed BiasAccumulateKernel to OutputStage. If no bias is provided
when the input is quantized, the kernel simply downscales the input.
Throw error if no bias is provided and input is floating point.
Change-Id: I645a4ee9c6014b0547778fdd92c9ec72ef2f0aab
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/114158
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I373e349ac35ff52ebcc895723d8aa61b754519d4
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115283
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Reviewed-by: Joel Liang <joel.liang@arm.com>
|
|
Change-Id: Ie2f398d62dea97e9201f77d22c9f0796db297b63
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115280
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Zhenglin Li <zhenglin.li@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I717d0ebbae5102da039b9295649aed8056e4cdfd
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/114960
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Joel Liang <joel.liang@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
|
|
Change-Id: Idf452cfa0428a36f2d718a6d438d6e59897e1e99
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115061
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Iecbfa3ebab890c778fb475403466d6fb168e9968
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/113357
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I82a3ec133193433ba9ed3efcb49c51a2b95b16c0
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/114962
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Zhenglin Li <zhenglin.li@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I9db00c846fa7fc223a22ab775025dfdea587ade8
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/114957
Reviewed-by: Joel Liang <joel.liang@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
NEGEMMLowpAArch64V8P4Kernel
Change-Id: If32cbdc65f2e1441595cae5b4824a9b4357c8bf6
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/113467
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
ARM_COMPUTE_NO_EXCEPTIONS macro guard
Cherry-picked public merge request from Codeplay
Change-Id: Id819177fcc86a64dc4e82eefe46b2f646619e8c0
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/114924
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I051b7e56b60bf1a55cdf014539ef71346d3aee26
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/114737
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Input reordering from NCHW to NHWC
Output reordering from NHWC to NCHW
Weights reordering from [Ofm x Ifm x Height x Width] to [Height x Width x Ifm x Ofm]
Change-Id: I85aabedb1f9c13700bc4919eb3130f4d4bd0b465
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/113631
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
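
The input reordering listed above (NCHW to NHWC) amounts to an index permutation. The following is a minimal sketch on a dense float buffer, assuming row-major storage with no padding. The function nchw_to_nhwc and its parameters are hypothetical and do not reflect the library's own permute implementation.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not the library's API): copies a dense NCHW buffer
// into a newly allocated dense NHWC buffer.
// Assumes row-major storage with no padding or strides.
std::vector<float> nchw_to_nhwc(const std::vector<float> &src,
                                std::size_t n, std::size_t c,
                                std::size_t h, std::size_t w)
{
    if(src.size() != n * c * h * w)
    {
        throw std::invalid_argument("buffer size does not match the given shape");
    }

    std::vector<float> dst(src.size());
    for(std::size_t b = 0; b < n; ++b)
    {
        for(std::size_t ch = 0; ch < c; ++ch)
        {
            for(std::size_t y = 0; y < h; ++y)
            {
                for(std::size_t x = 0; x < w; ++x)
                {
                    // Source index: channels vary slower than rows and columns.
                    const std::size_t src_idx = ((b * c + ch) * h + y) * w + x;
                    // Destination index: channels become the fastest-varying dimension.
                    const std::size_t dst_idx = ((b * h + y) * w + x) * c + ch;
                    dst[dst_idx] = src[src_idx];
                }
            }
        }
    }
    return dst;
}
```

The output reordering (NHWC to NCHW) is the same loop with src_idx and dst_idx swapped, and the weights reordering follows the same pattern with the four dimensions permuted accordingly.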
|
|
Change-Id: Ia71435f6e5c5610e2b76d6d4eb61a8847ca42305
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/114829
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Anthony Barbier <anthony.barbier@arm.com>
|