path: root/src/runtime/NEON/functions/NEGEMM.cpp
Age | Commit message | Author
2019-04-08 | COMPMID-2098: Scope handling of memory group resources. | Georgios Pinitas
Change-Id: Ie945526bd7845301458039edf3129253c1808505 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-on: https://review.mlplatform.org/c/938 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2019-02-19 | COMPMID-2006: NEON GEMMLowp assertion failure. | Georgios Pinitas
Mark auxiliary tensors as resizable when cloning. Change-Id: I582e6d09a7daadbc43cf02f46a53e51c178daacb Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-on: https://review.mlplatform.org/726 Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2019-02-18 | COMPMID-2005: Tensors have different quantization information. | Georgios Pinitas
NEGEMMInterleavedKernel quantization information was unset when using the respective convolution functions. Change-Id: I1fd2af33aaadcfe3eda8bf20a0e56afa9b77c9bb Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-on: https://review.mlplatform.org/722 Reviewed-by: Isabella Gottardi <isabella.gottardi@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2018-11-13 | COMPMID-1751: Remove output_3d_depth from NEGEMMLowpQuantizeDownInt32ToUint8ScaleByFixedPoint | Georgios Pinitas
Change-Id: I1d5bc4d24059917f9ddef0873dd3043b1f2320a8
2018-11-09 | COMPMID-1626: Fixed VGG 16/19 bad_alloc failure. | Pablo Tello
Some systems don't have enough memory to run the VGG networks; for example, on systems with only 2GB of memory the VGG example fails with a bad_alloc exception. This patch introduces the concept of a global memory policy in ACL: a mechanism that the library's functions can use to try to reduce memory consumption on systems with limited memory. In this specific case the VGG examples set the policy to MINIMIZE. The GEMM function checks whether the policy is MINIMIZE and, if so, does not use the pretransposed weights path, as this requires considerably more memory. Change-Id: I53abc3c9c64d045d8306793ffc9d24b28e228b7b
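The policy check described in this commit can be pictured with a minimal sketch. Everything below is an assumption made for illustration: the names `MemoryPolicy`, `GlobalMemoryPolicy`, `GemmPath`, `select_gemm_path` and the `weights_are_constant` parameter are hypothetical and are not taken from the library's source; only the MINIMIZE policy and the idea of skipping the pretransposed weights path come from the commit message.

```cpp
// Hypothetical names throughout; this is not the library's actual API.
enum class MemoryPolicy { NORMAL, MINIMIZE };
enum class GemmPath { PRETRANSPOSED_WEIGHTS, RESHAPE_PER_RUN };

// Process-wide policy holder (illustrative stand-in for a global setting).
class GlobalMemoryPolicy
{
public:
    static void set(MemoryPolicy policy) { current() = policy; }
    static MemoryPolicy get() { return current(); }

private:
    // Function-local static avoids initialization-order issues.
    static MemoryPolicy &current()
    {
        static MemoryPolicy policy = MemoryPolicy::NORMAL;
        return policy;
    }
};

// Choose the GEMM path. The pretransposed weights path is skipped when the
// policy is MINIMIZE because, per the commit message, it needs more memory.
// `weights_are_constant` is an assumed precondition for pretransposing.
GemmPath select_gemm_path(bool weights_are_constant)
{
    const bool allow_pretranspose =
        weights_are_constant && GlobalMemoryPolicy::get() != MemoryPolicy::MINIMIZE;
    return allow_pretranspose ? GemmPath::PRETRANSPOSED_WEIGHTS : GemmPath::RESHAPE_PER_RUN;
}
```

In this sketch an example application would call `GlobalMemoryPolicy::set(MemoryPolicy::MINIMIZE)` once at start-up, and every later call to `select_gemm_path` would then fall back to the lower-memory path.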
2018-11-08 | COMPMID-1736: Fixed out-of-bound write in CLIm2Col | Gian Marco Iodice
The issue was related to CLIm2Col when the number of input channels was less than the number of elements processed by each thread. The bug has been fixed in the validate_and_configure_window() function by setting the correct number of elements accessed in the output tensor. Also fixed an issue in GEMM3D when there is a single output channel. Change-Id: I094292d0c7662599c4a4c3916ec5f5821df5faef
2018-11-02 | COMPMID-1469: Add validate in NEGEMMMatrixAdditionKernel | Georgios Pinitas
Change-Id: I228e2503eb40c12869fbd7e834ac1309aa613480 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/145878 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-872 - Rework NEGEMMConvolutionLayer to use NEGEMM | Gian Marco Iodice
Change-Id: I55f0018ac7214775ebbca63f58a3bf5c93732fec Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142632 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-1401 Implement NEFullyConnectedLayer for QASYMM8 | Giorgio Arena
Change-Id: I0404df6d369855e2f458f2db8f26e81c80a1ee87 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/140148 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-1381: Cleaned up the AssemblyHelper interface | Anthony Barbier
Introduced a new IFunction for when we'll fork the arm_gemm functions. Increased encapsulation and abstraction of which method is used. Change-Id: I5fd8b14b5c77e7f8ecb09029b5e2eccd10dbdcf4 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/139108 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02 | COMPMID-970 : Remove QS8 / QS16 support | Vidhya Sudhan Loganathan
Removed fixed point related code. Change-Id: I487acf138dace3b0450e0d72ca7071eaec254566 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/137678 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-1145: (API) Introduce prepare() stage (NEON/CL/GLES) | Georgios Pinitas
Change-Id: I5b46764f9c3154ec3e3b9c951cc9e6dfbcb81dfb Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/134255 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
2018-11-02 | COMPMID-959: Perform pretranspose if allowed on NEON assembly | Georgios Pinitas
Change-Id: I281699ce7270aec1317c47b5a13799954cf6c9e8 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/130010 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-797 Integrate Mobilenet QASYMM8 with new graph. | Giorgio Arena
Change-Id: I4df63ec2f4eb27a8a6eec2bea27741bf8dec6910 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/126966 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-1021: CPUInfo refactoring. | Pablo Tello
Removed CPUTarget in favor of the CPUModel type. CPUInfo now holds a vector of N CPUs. CPUInfo auto-initialises upon construction with 1 GENERIC CPU. CPPScheduler fills CPUInfo's vector upon construction (at runtime). IScheduler has a single CPUInfo object and ThreadInfo always gets a pointer to it (to avoid copying the vector). Change-Id: I30f293258c959c87f6bac5eac8b963beb6a4d365 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/124626 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
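The ownership layout described in this commit can be sketched roughly as follows. This is an illustrative approximation, not the library's code: the `Scheduler` class, all member names, and the enumerators other than GENERIC are assumptions made for the example. The sketch only shows the three points from the commit message: a default of one GENERIC CPU, a per-CPU vector filled by the scheduler, and thread descriptors that hold a pointer instead of a copy.

```cpp
#include <cstddef>
#include <vector>

// Enumerators other than GENERIC are illustrative placeholders.
enum class CPUModel { GENERIC, A53, A55 };

// Holds one model entry per CPU; starts with a single GENERIC CPU.
class CPUInfo
{
public:
    CPUInfo() : _cpus(1, CPUModel::GENERIC) {}

    void resize(std::size_t count) { _cpus.resize(count, CPUModel::GENERIC); }
    void set_model(std::size_t index, CPUModel model) { _cpus.at(index) = model; }
    CPUModel model(std::size_t index) const { return _cpus.at(index); }
    std::size_t count() const { return _cpus.size(); }

private:
    std::vector<CPUModel> _cpus;
};

// Per-thread descriptor: points at the shared CPUInfo, never copies it.
struct ThreadInfo
{
    int            thread_id;
    const CPUInfo *cpu_info;
};

// Hypothetical scheduler that owns the single CPUInfo object and fills it
// from a list of detected models when it is constructed.
class Scheduler
{
public:
    explicit Scheduler(const std::vector<CPUModel> &detected)
    {
        if(!detected.empty()) // otherwise keep the default single GENERIC CPU
        {
            _cpu_info.resize(detected.size());
            for(std::size_t i = 0; i < detected.size(); ++i)
            {
                _cpu_info.set_model(i, detected[i]);
            }
        }
    }

    const CPUInfo &cpu_info() const { return _cpu_info; }

    ThreadInfo make_thread_info(int id) const
    {
        ThreadInfo info;
        info.thread_id = id;
        info.cpu_info  = &_cpu_info; // shared pointer, no vector copy
        return info;
    }

private:
    CPUInfo _cpu_info;
};
```

Handing out a pointer means each `ThreadInfo` stays two words wide regardless of how many CPUs are listed, at the cost of requiring the scheduler to outlive its thread descriptors.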
2018-11-02 | COMPMID-881: RSH new arm_gemm interface. | Pablo Tello
Change-Id: I1e2a1a77097d8017c274af3f97eba6964f80f5fa Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/122592 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-617: Add validate support for NEON FullyConnectedLayer | Ioan-Cristian Szabo
Change-Id: I08987022c8d4cc335c00b8af27bd3edb8fe64d3b Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/111596 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Alexander Gilday <alexander.gilday@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-866: Integrate SGEMV Neon Assembly from RSH | Michele Di Giorgio
Change-Id: Icbb43de7642e2b433d7471d70b9dbbde850989d3 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118197 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02 | COMPMID-765: Allow RSH's code to not have default cases in their switches | Anthony Barbier
Change-Id: I2d3cc9668852a1ba414fc3148866df408f770dc8 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118308 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-759 - CLGEMM optimization for McVail benchmarks | Gian Marco
This patch introduces an optimization for CLGEMM on Bifrost architectures which can bring FMA utilization to 40% on config 3 of McVail. The new CLGEMM does not require any reshape of matrix A and matrix B. This patch also adds the auto-config in CLConvolutionLayer and CLGEMM and extends the interface for NEGEMM and CLGEMM. Change-Id: Ibb354eda45e9ca64b14a99700fb21dff5989dda9 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/113716 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-750: Fix assembly kernel interfaces | Georgios Pinitas
Assembly kernel interfaces were wrongly translating the layout of the input matrices. Boolean flags transform0 and transform1 do not match the actual interface of the gemm assembly code which expects transpose0 and transposed1. Change-Id: Ia4df65a533834647fa63e78e8c897924793949df Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/113410 Tested-by: BSG Visual Compute Jenkins server to access repositories on http://mpd-gerrit.cambridge.arm.com <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02 | COMPMID-677: Integrate HGEMM assembly kernel (generic CPUs) | Pablo Tello
Change-Id: I39abf367fe7ea1a54475e2ac0ecec12e90806899 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/95378 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-662: Integrated the new a64_s8_gemm_12x8 + dot product kernel into ACL. | Pablo Tello
Change-Id: Id8f919e486a132fc58346c9f84fccbeeb83d19b3 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/94233 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02 | COMPMID-481: Add AArch32 GEMM | Moritz Pflanzer
Change-Id: Idba0b30bfb27866a46a22388014ab81432ea28dc Reviewed-on: http://mpd-gerrit.cambridge.arm.com/86196 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02 | COMPMID-481: Add AArch64 GEMM | Moritz Pflanzer
Change-Id: I34f94f99cb05f0eabafee13c5e623ee779b72360 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/83741 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02 | COMPMID-534: Add MemoryManager support in NEON functions | Georgios Pinitas
Adds support for:
-NECannyEdge
-NEConvolution
-NEDirectConvolution
-NEGEMM
-NEGEMMLowp
-NEGaussian5x5
-NEHOGDescriptor
-NEHOGGradient
-NEL2Normalize
-NELocallyConnectedLayer
-NENormalizationLayer
-NEScale
-NESobel5x5
-NESobel7x7
Change-Id: I68e05aa6054372fa873a882633a15fb97882c00d Reviewed-on: http://mpd-gerrit.cambridge.arm.com/87926 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-09-17 | COMPMID-433 - Port NEGEMM to support 16 bit fixed point | Gian Marco Iodice
Change-Id: I82de74d7027bbc8a00a4d6671e968785280d5f6c Reviewed-on: http://mpd-gerrit.cambridge.arm.com/79498 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-09-17 | COMPMID-418 Add check and fix comments after preprocessor conditions | Anthony Barbier
Change-Id: I1353fd652ee180e3931e58b4ce13d651a48c7e2c Reviewed-on: http://mpd-gerrit.cambridge.arm.com/79567 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com>
2018-09-17 | COMPMID-411 - Port CLGEMM to support 8 bit fixed point | Gian Marco Iodice
Change-Id: I6c8bd69ae9715e4d83d128b2162fc15aa5561afb Reviewed-on: http://mpd-gerrit.cambridge.arm.com/78804 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com>
2018-09-17 | COMPMID-344 Updated doxygen | Anthony Barbier
Change-Id: I32f7b84daa560e460b77216add529c8fa8b327ae