aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2019-11-15Fix possible crash in case of zero dimension tensor in the ONNXTee Jung
parser Signed-off-by: Jung Tae-young <tee.ty.jung@openedges.com> Signed-off-by: Matteo Martincigh <matteo.martincigh@arm.com> Change-Id: I396792d4d59172cccb50d77de7e6b74977b289ed
2019-11-15IVGCVSW-3486 Add clipping parameter validation in LstmQueueDescriptorjaneil01
* Add clipping parameter validation in LstmQueueDescriptor * Related UnitTest Signed-off-by: janeil01 <jan.eilers@arm.com> Change-Id: I86ff81cacc0e1fff5b78a8d6c2dcbf9ff57e2272
2019-11-15IVGCVSW-4129 Fix thread starvation due to low capture periodsColm Donelan
* Set default capture period to 10mSec. * Validate capture period in PeriodicCounterSelectionCommandHandler pull it up to 10mSec if it is lower. * Fix segmentation fault in GatordMock when receive thread closes. Signed-off-by: Colm Donelan <Colm.Donelan@arm.com> Change-Id: I9f7ddc70bd99c102c5baef872d28329976a4dc07
2019-11-15IVGCVSW-4074 Send Timeline message in RequestCounterDirectoryCommandHandlerMatteo Martincigh
* Added call to SendTimelineMessageDirectoryPackage in the handler * Updated the unit tests accordingly * Refactored SendTimelinePacket to remove macro Signed-off-by: Matteo Martincigh <matteo.martincigh@arm.com> Change-Id: I7bb6f8575945b99a0e77ef30ecfe4dee3058669e
2019-11-15IVGCVSW-4119 Fix FP16 to FP32 fallback mechanism in optimizer to work with ↵Aron Virginas-Tar
Dequantize * Check for output data type as well as input data type when determining whether we should attempt to fall back to FP32 if FP16 is not supported * Override output type for Dequantize in IsLayerSupported() instead of input type * Updated original input type from FP16 to FP32 in InsertConvertFp32ToFp16LayersAfter() Signed-off-by: Aron Virginas-Tar <Aron.Virginas-Tar@arm.com> Change-Id: Ic6477fd17cea5a91bd8bf9ae0cf836520897d5b7
2019-11-15IVGCVSW-4140 Report per-axis quantization as unsupported for ↵Aron Virginas-Tar
DepthwiseConvolution on ACL backends * This is a temporary measure that needs to be removed as soon as the NEON and CL DepthwiseConvolution workloads will have added support for per-axis quantization Signed-off-by: Aron Virginas-Tar <Aron.Virginas-Tar@arm.com> Change-Id: I24eb285230293392a6ed50aece1101e5aed7f90e
2019-11-15IVGCVSW-4073 Send stream info in the ConnectionAcknowledgedCommandHandlerMatteo Martincigh
* Added call to ISendTimelinePacket::SendStreamMetaDataPacket * Added call to ISendTimelinePacket::SendTimelineMessageDirectoryPackage * Added new StreamMetadataCommandHandler class to the mock Gatord service * Updated code and unit tests * Added include paths to the gatord mock target Signed-off-by: Matteo Martincigh <matteo.martincigh@arm.com> Change-Id: Ic6d200b513175884607b7c0563cbfa4942ff2fc6
2019-11-15IVGCVSW-4072 Add stream header to Timeline Message Directory packetMatteo Martincigh
* Refactored the WriteTimelineMessageDirectoryPacket function * Added the stream header to the packet * Updated decoders/parsers * Updated unit tests accordingly * Minor refactoring Signed-off-by: Matteo Martincigh <matteo.martincigh@arm.com> Change-Id: I58f15fde54adc6414ca9fd5fb8d6157cad867339
2019-11-15NNXSW-1853 Change SubgraphViewSelector algorithmRob Hughes
The current algorithm in SubgraphViewSelector has a bug that can lead to it producing subgraphs which have a dependency cycle (see the newly added test case 'ValidMerge' for a repro). It also fails to merge subgraphs in some cases where it could, which leads to smaller subgraphs. In the case of FSRCNN, the NPU cannot support these smaller subgraphs and so this is blocking us from supporting that network. This commit changes the algorithm to fix the dependency bug and also make it so that subgraphs are merged in the cases that were missed before. It also adds some unit tests to cover cases that were problematic before, and to extend coverage for the new algorithm. The new algorithm has two downsides compared to the previous one: 1. Disjoint subgraphs are not merged. This can never lead to a failed compilation by the NPU and so I believe this is less of an issue than the previous algorithm's "missed merges". This could however lead to a runtime performance loss in some cases as the NPU will be unable to parallelise as many operations. There are some unit tests that cover this which I have disabled. 2. The performance is worse. I have spent some time analysing this and for a graph with ~1000 layers the new algorithm takes 20ms vs. the old algorithm's 4ms (on my desktop PC). I believe the performance is still within acceptable limits. I also compared inception V3 (which was the network which caused performance issues with the original version of the splitting algorithm) and this new algorithm has not regressed there (200-300us in both cases). Change-Id: I1dd64a779f272723621e04d203b5a2752a6af2ef Signed-off-by: Robert Hughes <robert.hughes@arm.com>
2019-11-15Print CMake messages on stdout rather than stderrRob Hughes
The default version of message("...") print to stderr, which is inappropriate for informational messages such as the ones we are printing in these cases. Using message(STATUS "...") makes these messages appear on stdout instead which is more appropriate. Change-Id: I02f41e6b4948e6938566f06d7164444bd5b8199e Signed-off-by: Robert Hughes <robert.hughes@arm.com>
2019-11-15Add FP16 support to DebugWorkloadAron Virginas-Tar
Signed-off-by: Aron Virginas-Tar <Aron.Virginas-Tar@arm.com> Change-Id: Ia879f2d84a1b977474ee0dafa976f2aab32bd3ae
2019-11-15Only enable mixed precision FP16 pooling for Android QDerek Lamberti
Change-Id: Ic2c0ce7a7a99bbc430b7d6da272825540772e01d Signed-off-by: Derek Lamberti <derek.lamberti@arm.com>
2019-11-14Fix redundancy in call to configure() in ACL DepthwiseConvolution workloadsAron Virginas-Tar
Signed-off-by: Aron Virginas-Tar <Aron.Virginas-Tar@arm.com> Change-Id: I8f698c6ec9826ce1188bc43bd59fbf7b83455c1a
2019-11-14Add SpaceToDepth to GetLayerTypeAsCString()Aron Virginas-Tar
Signed-off-by: Aron Virginas-Tar <Aron.Virginas-Tar@arm.com> Change-Id: I263c78e02238fa7c7f9ab6408fb197664e5fe048
2019-11-14CL & Neon workload factories inherit from WorkloadFactoryBaseDerek Lamberti
Change-Id: I1f694be7ef1d333b5ef9b60ea7029454ade02628 Signed-off-by: Derek Lamberti <derek.lamberti@arm.com>
2019-11-14Fix link error due to pthread being linked in the wrong orderRob Hughes
Change-Id: I9602c758fe462b65d67de491d91fb2392b09b8bd Signed-off-by: Robert Hughes <robert.hughes@arm.com>
2019-11-14Fix a few compile errors:Rob Hughes
* Replace use of non-standard integral types (e.g. u_char) * Convert boost::filesystem::paths to std::strings using the .string() method rather than .c_str(), because on Windows .c_str() returns a wide character string, which is not convertible to a std::string. Change-Id: Ia86b0653697033bb1afa01e64b5b2103dd042ffd Signed-off-by: Robert Hughes <robert.hughes@arm.com>
2019-11-13IVGCVSW-3697 Add utility function to get ArgMinMaxFunction as stringFrancis Murtagh
* Allow logging of ConvertArgMinMax calls with specified Min/Max function which take place in HalPolicy Signed-off-by: Francis Murtagh <francis.murtagh@arm.com> Change-Id: Ic8c38106725023c864f7950dc9d0e2737485cfef
2019-11-13IVGCVSW-4053 Enable ArgMinMax EndToEndTest for NEON/CLJames Conroy
* Enabled for Float32 only, as per support in ACL. Signed-off-by: James Conroy <james.conroy@arm.com> Change-Id: I251fc832e3058d389ee9bef96856baff89ba6f9a
2019-11-13IVGCVSW-4128 Add Signed32 to supported input types for Ref ArgMinMaxFrancis Murtagh
* Enabled RefLayerTests for Signed32 Signed-off-by: Francis Murtagh <francis.murtagh@arm.com> Change-Id: Idbe6fb7607c7e44a8df560b55f28c64a4c4286cd
2019-11-13IVGCVSW-3695 Add CL ArgMinMax workloadJames Conroy
* Also enabled copy to/from CL for Signed32. Signed-off-by: James Conroy <james.conroy@arm.com> Change-Id: I0113182891f9767de73f04dcd81252c84c996eda
2019-11-12IVGCVSW-4051 Update ACL pin to 94e0cf960ea6116eb57fa88d9b951f859b52c602James Conroy
* Add is_initalised() check to CLScheduler in ClContextControl. * Now use CLDepthwiseConvolutionLayer instead of CLDepthwiseConvolutionLayer3x3. * Now use NEDepthwiseConvolutionLayer instead of NEDepthwiseConvolutionLayerOptimized. !android-nn-driver:2212 Signed-off-by: James Conroy <james.conroy@arm.com> Change-Id: I509af65315a4322dc820a5cc1bbd36ed6999b4a7
2019-11-12IVGCVSW-4079 Add support of per-axis quantization to DepthwiseConvolution2dTeresa Charlin
!android-nn-driver:2260 Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com> Change-Id: Iad93c1940568ffa65ed314c8871ea66caf4f9e4a
2019-11-12IVGCVSW-4069 Add ProfilingGuid to NetworkJan Eilers
Added ProfilingGuid to * INetwork, * Network, * IOptimizedNetwork and * OptimizedNetwork !android-nn-driver:2234 !armnn:2250 Signed-off-by: Jan Eilers <jan.eilers@arm.com> Change-Id: I235116992cc47b4f385b7eb9da514c6350ca00f4
2019-11-12IVGCVSW-3839 Add support of per-axis quantization to reference ↵Aron Virginas-Tar
TransposeConvolution2d Signed-off-by: Aron Virginas-Tar <Aron.Virginas-Tar@arm.com> Change-Id: Ie0dc1204eee925adfb1e59aba3f1137178302184
2019-11-11IVGCVSW-4064 ArmNN Master fails due to an error in RefArgMaxAxis2Uint8TestFrancis Murtagh
* Fix input data to allow for loss of precision due to valgrind which causes incorrect quantization of multiples of 5 with scale of 2. Signed-off-by: Francis Murtagh <francis.murtagh@arm.com> Change-Id: I354dcb8117e1ab07771b78d0e4808d9f3f95925b
2019-11-11IVGCVSW-4104 Report Conv2d per-axis quantization unsupported on ACL backendsAron Virginas-Tar
* Teporarily return false from IsConvolution2dSupported() whenever the weights tensor has per-axis quantization in order to avoid exceptions being thrown from ACL during attempted execution * Should be reverted once per-axis quantization support will have been added to the ACL backends Signed-off-by: Aron Virginas-Tar <Aron.Virginas-Tar@arm.com> Change-Id: Ie2e1a7f3f5550a4b43f7f007ee5c86a8760872eb
2019-11-08IVGCVSW-4067 Change LayerGuid to use ProfilingGuidjaneil01
* Refactoring to enable ProfilingGuid * Add profiling includes to Android.mk Signed-off-by: Jan Eilers <jan.eilers@arm.com> Change-Id: Ieb25e15e3dc302eb42817d824ad8411ac76dcfe8
2019-11-08IVGCVSW-4077 Disable NEON memory importJames Conroy
* Temporarily handles cases in CalculateEdgeStrategy where dstFactory pointer is null when import is disabled. * This patch is required for ensuring debug layer works correctly when executing a model on Neon. Signed-off-by: James Conroy <james.conroy@arm.com> Change-Id: I7304723246d362d6d9073c3d0b1224e194a8532c
2019-11-08Add profiling includes to Android.mkjaneil01
Signed-off-by: janeil01 <jan.eilers@arm.com> Change-Id: I8f95846a461aa77c8cff4311f92368a419bbb28d
2019-11-08IVGCVSW-4108 Fixed invalid data type exceptionMike Kelly
* Added support for QuantizedSymm8PerAxis to ArmComputeTensorUtils. Signed-off-by: Mike Kelly <mike.kelly@arm.com> Change-Id: Ib8662f216bc4b6b54e0099780f73bcf6ef05384b
2019-11-08MLCE-144 Fix cts MAX_POOL_2D_V1_0 testsFinn Williams
Signed-off-by: Finn Williams <Finn.Williams@arm.com> Change-Id: I2da66efca40bc21d417efc42a225877d94e31428
2019-11-07IVGCVSW-4107 Fix bug in ProfilingConnectionDumpToFileDecoratorTestsAron Virginas-Tar
* Replace predefined file name with randomly generated file name to avoid reading back old dumps Signed-off-by: Aron Virginas-Tar <Aron.Virginas-Tar@arm.com> Change-Id: Ia48a9cda4527c585453383a5d758e1831c38604a
2019-11-07IVGCVSW-4102 Move ProfilingGuid to public interfacejaneil01
* Moved ProfilingGuid to Types.hpp * Refactoring to enable ProfilingGuid Signed-off-by: janeil01 <jan.eilers@arm.com> Change-Id: Ibf77002d74e484f8a63ffd96aa14303c1f0d38ae
2019-11-07IVGCVSW-3951 Create the timeline decoderFinn Williams
* Added ITimelineDecoder.h C interface * Added an example implementation of ITimelineDecoder.h * Added command handlers for the timeline directory and objects * Added tests for the decoder implementation * Changed ReadSwTraceMessage to take a const unsigned char* so it can be used by the directory command handler * Fixed some bugs in ProfilingUtils.cpp and related tests Change-Id: If06faf1fe0274a8f022f194a6d3527f5ce5374c6 Signed-off-by: Finn Williams <Finn.Williams@arm.com>
2019-11-07Escape angle brackets in dot file labelsRob Hughes
This was a bug that meant invalid dot files were produced due to MemCopy layers having a name including "->". Change-Id: If9f5b13d433f6a7328bf0ad8c7ec89cdce2462b0 Signed-off-by: Rob Hughes <robert.hughes@arm.com>
2019-11-06IVGCVSW-4065 Add a RecordEvent functionMatteo Martincigh
* Added RecordEvent utility function to the TimelineUtilityMethods class * Added new utility function to get a timestamp * Added unit tests Signed-off-by: Matteo Martincigh <matteo.martincigh@arm.com> Change-Id: Ia3f8fe7397915fa6c903ce0c0abab3047cea628c
2019-11-06IVGCVSW-3837 Add support for per-axis quantization to reference ↵Aron Virginas-Tar
Convolution2d workload Signed-off-by: Aron Virginas-Tar <Aron.Virginas-Tar@arm.com> Change-Id: I0ac08ba4864d48e6f64c4ac645dad8ea850be112
2019-11-06IVGCVSW-4038 Convert Strided_Slice Shrink_Axis_Mask Parameter to ACL formatFrancis Murtagh
* Add conversion method to reverse bits in Shrink_Axis_Mask * Add Unit tests for Neon, CL and Reference backends * Fix supportedness of constant layer which is causing error in DeepSpeech Uint8 * Also convert the Begin_Mask and End_Mask Change-Id: I448b083c3463558e8fb5204923ab554cd43264ba Signed-off-by: Francis Murtagh <francis.murtagh@arm.com>
2019-11-06IVGCVSW-3444 File Only Profiling ConnectionKeith Davis
* Add FileOnlyProfilingConnection Decorator * Fix bug where Conn Ack not automatically sent back * Modify GatordMock to use the Counter Directory class. * Promote DirectoryCaptureCommandHandler from GatordMock into ArmNN. * Remove MockUtils as it's contents were moved or deleted. * Rewrite GatordMockTests to use Counter Directory class. * Flush streams in ProfilingConnectionDumpToFileDecorator::Close. Signed-off-by: Keith Davis <keith.davis@arm.com> Signed-off-by: Colm Donelan <Colm.Donelan@arm.com> Change-Id: I77b2aedece24150dd31691b577f3b5d81b2e226f
2019-11-05GitHub #294 Add sample serialized graphTee Jung
converted from tensorflow model: http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v3_small_coco_2019_08_14.tar.gz Signed-off-by: Jung Tae-young tee.ty.jung@openedges.com Change-Id: I5867792f36a37d574ce44738d6a412c5557b97df
2019-11-05Rename Optimize's errMessages to messagesRob Hughes
This parameter can contain both errors and warnings, so calling it errMessages is confusing as the user only expects to see errors here. Ideally this rename should be propagated to the lower layers of the implementation, but the public header change is the most useful part. Change-Id: I062564cf38d36f950adfa7c37c090b189e068134
2019-11-05IVGCVSW-4065 Use platform-specific thread id size in Timeline packetsMatteo Martincigh
* Using std::thread::id as a general data type for thread id * Added new profiling util functions for reading/writing a thread id to/from a buffer * Fixed code and unit tests accordingly Signed-off-by: Matteo Martincigh <matteo.martincigh@arm.com> Change-Id: I1aaa3bdb740c8a97010f655b1e9f7581b52e7aff
2019-11-05IVGCVSW-4065 Refactor the IPacketBuffer smart pointersMatteo Martincigh
* Added convenience "using" statement for the unique pointers to IPacketBuffer * Replaced all the occurrencies in the code Signed-off-by: Matteo Martincigh <matteo.martincigh@arm.com> Change-Id: Iffec3a425ffbc1ecb23012971563a48139eb32eb
2019-11-05IVGCVSW-3836 Add support for Int32 per-axis scalesAron Virginas-Tar
* Added ScaledInt32PerAxisDecoder implementation * Added new case for Signed32 in MakeDecoder that returns a ScaledInt32PerAxisDecoder if the tensor info has multiple quantization scales Signed-off-by: Aron Virginas-Tar <Aron.Virginas-Tar@arm.com> Change-Id: I8b3c11091644da993044d2a0fe2aba6b06b5af56
2019-11-05IVGCVSW-3843 Add support of per-axis quantization to BuildArmComputeTensorInfoAron Virginas-Tar
Signed-off-by: Aron Virginas-Tar <Aron.Virginas-Tar@arm.com> Change-Id: I0bb0e9da306eee3e19dc9967a6c8bb01da998deb
2019-11-04IVGCVSW-3835 Create Encoder and Decoder for QSymm8PerAxisKeith Davis
* Add QuantizedSymm8PerAxis to armnn DataType (types.hpp) and * Add Quantize and Dequantize template for int8 in TypeUtils to be able to compute QSymm8 of the weight * Create PerAxisIterator for per-axis quantization * Create QSymm8PerAxisDecoder * Create QSymm8PerAxisEncoder Signed-off-by: Keith Davis <keith.davis@arm.com> Change-Id: Ibcfe0288a197b7ee50b543bdbd77b7edb8a547c2
2019-11-04Add fp16 support for dequantizeJan Eilers
* Changed RefDequantizeWorkload to use Encoder/Decoder * Added related unit tests for Cl, Neon and Ref Signed-off-by: Jan Eilers <jan.eilers@arm.com> Change-Id: Ic2fd4103090dd2127c6859b49305736f7b2dfb05
2019-11-04Make onnx parser to support TanH / Sigmoid / LeakyRelu layersTee Jung
Signed-off-by: Jung Tae-young tee.ty.jung@openedges.com Change-Id: I44d24b525b78b8d3fee0197abda7bd667eb04d83
2019-11-04Fix crash issueTee Jung
* armnnOnnxParser makes tensorInfo from graph->value_info but PyTorch does not add weight/bias tensor information to graph->value_info so tensorInfo of const tensor should be extracted from graph->initializer Signed-off-by: Jung Tae-young tee.ty.jung@openedges.com Change-Id: Ib2656dd25abc522012cf413e843fe03949cb2eb0