Age Commit message Author
2018-11-02COMPMID-834 Fix arm_compute_nightly_validation getting killedMichalis Spyrou
Changed CLReductionOperationKernel: each kernel now computes a 2D slice instead of a 1D one. This reduces the memory footprint for a 4k input image from around 1.6 GB to a few MB. The large footprint came from the __local memory and was probably the cause of this bug. Change-Id: I71ac71ff09b041c945a134177600f0f3475e48cf Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117835 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-848 NEPoolingLayerKernel incorrectly reports it supports asymmetric paddingMichalis Spyrou
Add asymmetric padding support for NEPoolingLayer. Change-Id: Ia5cc660aeca636c3c45df4916a28974cc2b7f2f4 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117275 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
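For illustration only, a minimal sketch of how asymmetric padding enters the pooled output size along one dimension. It assumes floor rounding; the function name and parameters are hypothetical and are not taken from the library.

```cpp
#include <cassert>

// Hypothetical helper, not the library's API: output size of a pooling
// window along one dimension when the two sides are padded independently.
// Integer division implements floor rounding for non-negative operands.
int pooled_output_size(int input, int kernel, int stride, int pad_before, int pad_after)
{
    assert(stride > 0);
    assert(input + pad_before + pad_after >= kernel);
    return (input + pad_before + pad_after - kernel) / stride + 1;
}
```

With a symmetric-only interface, pad_before and pad_after would be forced to the same value, so a case such as padding only the right/bottom edge could not be expressed.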
2018-11-02COMPMID-748 - Integrating optimized SGEMM for bifrostGian Marco
This patch introduces a new GEMM that improves MAC utilisation by 10% compared to the GEMM without reshape. However, this implementation is not faster in all cases, because the time needed to reshape the matrices must be taken into account. For this reason, a heuristic that selects the optimal GEMM has been added to the function. More information about the heuristic can be found at COMPMID-852. With this patch, the performance of GoogleNet, MobileNet, VGG16 and SqueezeNet improves by 1.5x. More information about the performance uplift can be found here: https://confluence.arm.com/display/MLENG/GEMM+FP32+performance%3A+ACL+18.02 Change-Id: I024563c06b9aed02a211a974e452bae5c233b04c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117140 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-765: Fixed output accessor in LeNet example, and disabled colors when not running in a terminalAnthony Barbier
Change-Id: I4ec90803c5dc41b0cee05c36113ae3f189564d58 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117831 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
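The commit does not say how the terminal check is done. One common approach on POSIX systems is isatty(); the sketch below assumes a POSIX platform, and the helper names are hypothetical.

```cpp
#include <string>
#include <unistd.h> // POSIX: isatty(), STDOUT_FILENO

// Hypothetical helper, not the library's API: true when stdout is attached
// to a terminal rather than redirected to a file or a pipe.
bool stdout_is_terminal()
{
    return isatty(STDOUT_FILENO) != 0;
}

// Hypothetical helper: wrap text in an ANSI colour escape sequence only
// when colours are enabled, so redirected logs stay free of escape codes.
std::string colourise(const std::string &text, bool use_colour)
{
    if(!use_colour)
    {
        return text;
    }
    return "\033[1;32m" + text + "\033[0m"; // bold green, then reset
}
```

A caller would typically evaluate stdout_is_terminal() once at start-up and pass the result as use_colour.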
2018-11-02COMPMID-564: Implement reference and CL/NEON validation for CustomConvolution (output S16)Sanghoon Lee
Change-Id: Ic099336f558e994210a59e14ec0171fae68ccb80 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116663 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-765: Added missing <errno.h> includeAnthony Barbier
Change-Id: I25424481ddbbeb43f940cf51cef791e4fd83ea92 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117676 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-860: Neon HGEMM integrated assembly kernel from RSH for Arm Cortex-A55r1.Pablo Tello
Change-Id: I640ae54dcc4591915c7a539b27728f05b70cf0eb Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117616 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-798 Add instrumentation to NEON kernelsAnthony Barbier
Change-Id: I9dbb090cac731d68bd98a7d1c8ab0e1cb0a5c911 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116746 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-765: Enable fp16 extension for arm64-v8.2-aPablo Tello
Explicitly add -march=armv8.2-a+fp16 for target arm64-v8.2-a; otherwise __ARM_FEATURE_FP16_VECTOR_ARITHMETIC is undefined and none of the FP16 NEON code is compiled. Change-Id: I698819d842de996c1b4c88ebd0cf8664c5f70d58 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117601 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
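As a small illustration of why the flag matters: code guarded by the feature macro silently drops out of the build when the macro is undefined. The function name below is hypothetical.

```cpp
// Hypothetical probe, not part of the library: reports at compile time
// whether the compiler advertises FP16 vector arithmetic. Any code placed
// inside the #ifdef branch is simply not compiled when the macro is
// missing, for example when -march lacks the +fp16 extension.
constexpr bool has_fp16_vector_arithmetic()
{
#ifdef __ARM_FEATURE_FP16_VECTOR_ARITHMETIC
    return true;
#else
    return false;
#endif
}
```

Building the same file with and without -march=armv8.2-a+fp16 on an AArch64 toolchain should flip the result, assuming the toolchain supports the extension.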
2018-11-02COMPMID-816 - Optimizing CLGEMMLowpMatrixMultiplyCore - Part1Gian Marco
The performance improvements have been reported at the following confluence page: https://confluence.arm.com/display/MLENG/GEMMLowp+performance%3A+ACL+18.02 Config3 of McVail looks improved by 29x Change-Id: I8b203c0b75fc368f85cea863b7eed398fab3e79a Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115783 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-842: Add NEON QASYMM8 RELU ActivationMichele Di Giorgio
Change-Id: I7197d2ad7ac08112eba1570a257ad011b1ce0b75 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117404 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-858: Assert in ICLKernel on higher window dimensions moved to enqueueAnthony Barbier
Change-Id: I49d501e82f5c69b6912cb9e5fa684a904c62ed8e Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117409 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-841: Add CL QASYMM8 RELU ActivationMichele Di Giorgio
Change-Id: I8e0b7cad2f977942224d0116e8498bf9b2d6014d Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117229 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-765 - Add issue_template for githubMichalis Spyrou
When someone creates a new issue on GitHub, they will see this standardized template. This is a way for users to provide useful information that they sometimes forget. Change-Id: I090733e621d1f9c8059f88298981279b4d304ac3 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117098 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-857 ARMCV Failure to Build on RHEL platformMichalis Spyrou
Change-Id: I134cdfcee3cfc39d122d21038666021d1989dea1 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117348 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-847: Add MobileNet_v1_0.75_160.Georgios Pinitas
Change-Id: Ib21de61fe39d2768638af11c067dfc7bcf63aae2 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117112 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2018-11-02COMPMID-784: Doxygen fixesPablo Tello
Change-Id: I35f429fbf08dece7c759242c37e0a68b0851ce49 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117231 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-765: Updated changelog before 18.01Anthony Barbier
Change-Id: I0ec722803e8c32c0e284f219e996d7e60bc0d82e Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117192 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-765 UPDATE_DATE Switching to use mpd-repository for dataAnthony Barbier
Change-Id: If19b20ed94c16e7d5a5a0f1b82b49a62ea1d60e9 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117171 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02APPBROWSER-377: GCConvolutionLayer support for FP16Stephen Li
Change-Id: I801b5e393a16a9f92c062826e6fcfd5982ca7bb3 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116584 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 COMPMID-847: DATA_UPDATE Add MobilenetV1 224,160 data.Georgios Pinitas
Change-Id: Ia00a594cc2621065fe93514cc740f61ff187ec7d Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117114 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
2018-11-02COMPMID-815: Updated NEWinogradLayer with the latest code from Research.Pablo Tello
Change-Id: I86d7f53b5f5d1dbc22078aea5c32b08a25d1f49e Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116634 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-719: NEPermuteKernel refactoringPablo Tello
Change-Id: I91b43d9706ac3244ce43684967ace0b022d35bad Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/114988 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-838 Implement CLPermuteMichalis Spyrou
Change-Id: I6d97b649f1ebc289c9e6f8949e67740a6b3cbcb2 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116636 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-674 - Create Google InceptionV3 exampleGeorgios Pinitas
Change-Id: I389e0d4104b7dde60b7cdd612a83f3328517e44c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115804 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-791: Adds support of QASYMM8 in NEDepthwiseConvolution3x3Georgios Pinitas
Change-Id: I1a9ed6c3420ddf8978aeaad48d9915333b006b49 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116374 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-849: Changed default toolchain to Clang on AndroidAnthony Barbier
Change-Id: I345aa8455f53980b6e17c0963a8b593a1dbe38be Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116764 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02IVGCVSW-847 Fix {NEON/CL}PoolingLayerKernel configDiego Lopez Recas
Also, add validation test that hits the discovered failure for CL. Change-Id: I5573e0a3f169b85d5fb7299e7c48d74be7165208 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/112717 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-753 Add benchmarks for ActivationLayers used in MobileNetGiorgio Arena
Change-Id: Iafc16409430274d5126f0fb054b0de5de6b6ca8f Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116635 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02COMPMID-751 QASYMM8 ActivationLayer optimisation: don't requantize if not necessaryGiorgio Arena
Change-Id: Iea8a21f7c71025bfde6fdf7c7a7c92ba749b189b Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116673 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
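A minimal sketch of why requantization can be skipped, assuming the usual asymmetric scheme real = scale * (q - offset) and a plain ReLU. The struct and function names are hypothetical and this is not the kernel's actual code: when input and output share the same scale and offset, ReLU reduces to clamping the quantized value at the offset.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical type for illustration: real = scale * (q - offset).
// The offset is assumed to lie in [0, 255].
struct QuantInfo
{
    float scale;
    int   offset;
};

// ReLU on an 8-bit asymmetric quantized value.
uint8_t relu_qasymm8(uint8_t q, QuantInfo in, QuantInfo out)
{
    if(in.scale == out.scale && in.offset == out.offset)
    {
        // Fast path: real zero maps to the offset, so no float math is needed.
        return static_cast<uint8_t>(std::max(static_cast<int>(q), in.offset));
    }
    // General path: dequantize, apply ReLU, requantize with saturation.
    const float real    = in.scale * static_cast<float>(static_cast<int>(q) - in.offset);
    const float act     = std::max(0.0f, real);
    const int   requant = static_cast<int>(std::lround(act / out.scale)) + out.offset;
    return static_cast<uint8_t>(std::min(255, std::max(0, requant)));
}
```

Both paths give the same result when the quantization parameters match; the fast path only avoids the float conversion and rounding.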
2018-11-02COMPMID-832: Clean up tests.Georgios Pinitas
Removes QS8 and QS16 tests from benchmarks. Change-Id: Idf82d33159b2066d50ac2d454140938e43160779 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116626 Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-751 Processing 8 elements makes computation up to 80us faster on MobileNet QASYMM8 dwc layersGiorgio Arena
Change-Id: I30eaea3f3625086e311ad201ef73a8f06a01e382 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116521 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-839 Added method to clear the Kernel Library's program cacheAnthony Barbier
Change-Id: If2e14c19f16686a2a8e05832845f8bfcf0f0cdaf Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116537 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-835: Valgrind makes UNIT/Utils/RoundFloatToNearestUp fail on aarch64Pablo Tello
Workaround for Valgrind round() issue on aarch64. Under Valgrind, std::round(-4.500000) returns -4.000000 instead of -5.000000. I think there is a bug in Valgrind's code for aarch64 where the rounding mode is not properly set up, which is why round to zero is used all the time. Change-Id: If8fbee98e022856fcc48e454f7afd447f1f193e9 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116457 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
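For reference, std::round rounds halfway cases away from zero, so -4.5 is expected to become -5.0, whereas rounding toward zero gives -4.0. The sketch below spells out that expected behaviour with a hypothetical helper; it is not the workaround used in this commit, which the message does not describe.

```cpp
#include <cmath>

// Hypothetical helper for illustration: round half away from zero without
// calling std::round. Adequate for ordinary values; inputs a hair below
// .5 can be affected by floating-point error in the addition.
double round_half_away_from_zero(double v)
{
    return (v < 0.0) ? -std::floor(-v + 0.5) : std::floor(v + 0.5);
}
```

On a correctly behaving platform this agrees with std::round for the value quoted in the commit message, while std::trunc reproduces the faulty result.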
2018-11-02COMPMID-752 Creating an example for QASYMM8 MobileNetGiorgio Arena
Change-Id: Ic76b3b6adaff8c84ba4d2ca5283d9291c69344f0 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/114466 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-769 Add asymmetric padding support in NEON kernels.Michalis Spyrou
- NEDirectConvolutionLayer - NEDepthwiseConvolutionLayer3x3 Change-Id: Id4d7d17ee334639c059015a290b8fc34712706ee Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115430 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-816 - Enabled CLConvolutionLayer to use CLGEMM function instead of CLGEMMMatrixMultiplyKernel kernel.Gian Marco
Change-Id: If035fa3d1fb3ff4012442bcd908c370d21aa6657 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115990 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-830 Fix hang in arm_compute_benchmark NEONMichalis Spyrou
The problem seems to happen when calling clFinish inside the CLScheduler destructor. Removed the destructor; sync() is now called in the benchmarks' main.cpp. Change-Id: Ibb36a0d19aa03349d291407a1fb8266dce3ec75b Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116288 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-752: DATA_UPDATE Add MobilenetV1 QASYMM8 data.Georgios Pinitas
Change-Id: Icbb569acdfb5cd9d669341921d585297a5840bb3 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116192 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-765 - Reduced to 4 the batch size for AlexNetGian Marco
This patch also removed QS8 AlexNet benchmarking for NEON and set the flag weights_reshaped to false for CL Change-Id: I8db21b007c3b25b870e9072f8e02e36d1c1281c9 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115999 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-785: Add QASYMM8 support for pooling layerGeorgios Pinitas
Adds generic pooling case for QASYMM8 Change-Id: I37d38a92ca61651e915fbbbb6da88e180390b4ab Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115439 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-765: Fixed unused variable warningAnthony Barbier
Change-Id: I244954f748169cefcf71409bc9fdbc45de816ba5 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115878 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-786: Remove all the fill() calls in benchmarksAnthony Barbier
Filling buffers with random data takes a significant amount of time and in most cases doesn't affect performance. We will therefore only keep fill() in the functions for which it matters. Change-Id: Ica34fe09941f27d6f0417f33176847febf722bc3 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115892 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02COMPMID-782 Port examples to the new formatMichalis Spyrou
Change-Id: Ib178a97c080ff650094d02ee49e2a0aa22376dd0 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115717 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-565 - Implement reference and CL/NEON validation for CustomConvolutionSeparableSanghoon Lee
Change-Id: I81fae268d158aec882dbeadb5597dc9f7274d865 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115347 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-800: Display build_information() at the beginning of the runsAnthony Barbier
Change-Id: Iba1e2f021f19351edf849239d10fb9f3788a67c8 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115743 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-753 Add benchmarks for GEMM/GEMMLowp used in AlexNetGiorgio Arena
Change-Id: Ie680065fe98c2fcdefad1fd5240f0a951df6e4cf Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115779 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02IVGCVSW-863 calculate_max_window..() family takes ValidRegionDiego Lopez Recas
Change-Id: I91e39713ffa580e9d2213988ad3517a8a41bf4e8 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/114013 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02APPBROWSER-376: Workaround for scale validation error.Frank Lei
Use "vec2 scale" instead of scale_x/scale_y to work around this issue. Change-Id: Ieae55327596fdb853d7b625262fec3a3a84f577c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115143 Reviewed-by: Joel Liang <joel.liang@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Frank Lei <frank.lei@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-742: Add image input support for Harris Corners testsAlex Gilday
Change-Id: I4833eec0734776d8683fe867bb4f4d827f1a2fb7 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115503 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>