Age | Commit message | Author |
|
* Support added for ACL neon slice workload
* Utility function created to translate ArmNN slice layer params to ACL neon slice layer equivalent
* Neon slice layer tests added as per SliceTestImpl.hpp
Signed-off-by: josh minor <josh.minor@arm.com>
Change-Id: Id583465311879af139e8e977f16ed2280c937ac7
|
|
* Fixed numerous CTS/VTS failures related to Quantization
Signed-off-by: Mike Kelly <mike.kelly@arm.com>
Change-Id: If5c20256366e80b6b9bbc46b2a1c410a9b8c48e1
|
|
* Improve implementation of Guid Generator to separate the range of
Static Guid and Dynamic Guid
* Unit tests to ensure non-collision
Signed-off-by: Narumol Prangnawarat <narumol.prangnawarat@arm.com>
Change-Id: I4ad1a75ea0b1f37155da0decafb51fc5a61e4187
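The range separation described above could be sketched as follows. This is a minimal illustration, not the commit's implementation: it assumes static GUIDs take the upper half of a 64-bit space (top bit set) and dynamic GUIDs the lower half. The names `kMinStaticGuid`, `MakeStaticGuid`, `DynamicGuidGenerator` and `IsStaticGuid` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>

// Assumption: the top bit of a 64-bit value marks a static GUID.
constexpr uint64_t kMinStaticGuid = 1ull << 63;

// Static GUIDs are derived from a string and forced into the upper range.
inline uint64_t MakeStaticGuid(const std::string& name)
{
    return static_cast<uint64_t>(std::hash<std::string>{}(name)) | kMinStaticGuid;
}

// Dynamic GUIDs come from a counter that must stay below the static range.
class DynamicGuidGenerator
{
public:
    uint64_t Next()
    {
        assert(m_Next < kMinStaticGuid && "dynamic GUID range exhausted");
        return m_Next++;
    }

private:
    uint64_t m_Next = 0;
};

// A GUID is static exactly when it falls in the upper range.
inline bool IsStaticGuid(uint64_t guid)
{
    return guid >= kMinStaticGuid;
}
```

Because the two ranges are disjoint by construction, a static GUID can never equal a dynamic one, which is the non-collision property the unit tests are said to check.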
|
|
Signed-off-by: Jung Tae-young <tee.ty.jung@openedges.com>
Signed-off-by: Matteo Martincigh <matteo.martincigh@arm.com>
Change-Id: I1f0dfa4ca76e1c85a2b8fb5de12039a260224951
|
|
parser
Signed-off-by: Jung Tae-young <tee.ty.jung@openedges.com>
Signed-off-by: Matteo Martincigh <matteo.martincigh@arm.com>
Change-Id: I396792d4d59172cccb50d77de7e6b74977b289ed
|
|
* Add clipping parameter validation in LstmQueueDescriptor
* Related UnitTest
Signed-off-by: janeil01 <jan.eilers@arm.com>
Change-Id: I86ff81cacc0e1fff5b78a8d6c2dcbf9ff57e2272
|
|
* Set default capture period to 10 ms.
* Validate capture period in PeriodicCounterSelectionCommandHandler and
raise it to 10 ms if it is lower.
* Fix segmentation fault in GatordMock when the receive thread closes.
Signed-off-by: Colm Donelan <Colm.Donelan@arm.com>
Change-Id: I9f7ddc70bd99c102c5baef872d28329976a4dc07
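The capture period validation amounts to clamping the requested value to a lower bound. A minimal sketch, assuming the period is expressed in microseconds; `kMinCapturePeriodUs` and `ValidateCapturePeriod` are hypothetical names, not taken from the commit:

```cpp
#include <algorithm>
#include <cstdint>

// Assumption: capture periods are expressed in microseconds, so 10 ms is 10000.
constexpr uint32_t kMinCapturePeriodUs = 10000;

// Returns the requested period, raised to the minimum if it is lower.
inline uint32_t ValidateCapturePeriod(uint32_t requestedUs)
{
    return std::max(requestedUs, kMinCapturePeriodUs);
}
```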
|
|
* Added call to SendTimelineMessageDirectoryPackage in the handler
* Updated the unit tests accordingly
* Refactored SendTimelinePacket to remove macro
Signed-off-by: Matteo Martincigh <matteo.martincigh@arm.com>
Change-Id: I7bb6f8575945b99a0e77ef30ecfe4dee3058669e
|
|
Dequantize
* Check for output data type as well as input data type when determining
whether we should attempt to fall back to FP32 if FP16 is not supported
* Override output type for Dequantize in IsLayerSupported() instead of
input type
* Updated original input type from FP16 to FP32 in InsertConvertFp32ToFp16LayersAfter()
Signed-off-by: Aron Virginas-Tar <Aron.Virginas-Tar@arm.com>
Change-Id: Ic6477fd17cea5a91bd8bf9ae0cf836520897d5b7
|
|
DepthwiseConvolution on ACL backends
* This is a temporary measure that needs to be removed as soon as the
NEON and CL DepthwiseConvolution workloads support per-axis
quantization
Signed-off-by: Aron Virginas-Tar <Aron.Virginas-Tar@arm.com>
Change-Id: I24eb285230293392a6ed50aece1101e5aed7f90e
|
|
* Added call to ISendTimelinePacket::SendStreamMetaDataPacket
* Added call to ISendTimelinePacket::SendTimelineMessageDirectoryPackage
* Added new StreamMetadataCommandHandler class to the mock Gatord service
* Updated code and unit tests
* Added include paths to the gatord mock target
Signed-off-by: Matteo Martincigh <matteo.martincigh@arm.com>
Change-Id: Ic6d200b513175884607b7c0563cbfa4942ff2fc6
|
|
* Refactored the WriteTimelineMessageDirectoryPacket function
* Added the stream header to the packet
* Updated decoders/parsers
* Updated unit tests accordingly
* Minor refactoring
Signed-off-by: Matteo Martincigh <matteo.martincigh@arm.com>
Change-Id: I58f15fde54adc6414ca9fd5fb8d6157cad867339
|
|
The current algorithm in SubgraphViewSelector has a bug that can lead to
it producing subgraphs which have a dependency cycle (see the newly
added test case 'ValidMerge' for a repro). It also fails to merge
subgraphs in some cases where it could, which leads to smaller subgraphs.
In the case of FSRCNN, the NPU cannot support these smaller subgraphs and
so this is blocking us from supporting that network.
This commit changes the algorithm to fix the dependency bug and
also make it so that subgraphs are merged in the cases that were missed
before. It also adds some unit tests to cover cases that were problematic
before, and to extend coverage for the new algorithm.
The new algorithm has two downsides compared to the previous one:
1. Disjoint subgraphs are not merged. This can never lead to a failed
compilation by the NPU and so I believe this is less of an issue than
the previous algorithm's "missed merges". This could however lead to a
runtime performance loss in some cases as the NPU will be unable
to parallelise as many operations. There are some unit tests that cover
this which I have disabled.
2. The performance is worse. I have spent some time analysing this and
for a graph with ~1000 layers the new algorithm takes 20ms vs. the
old algorithm's 4ms (on my desktop PC). I believe the performance is
still within acceptable limits. I also compared Inception V3 (which was
the network which caused performance issues with the original version of
the splitting algorithm) and this new algorithm has not regressed there
(200-300us in both cases).
Change-Id: I1dd64a779f272723621e04d203b5a2752a6af2ef
Signed-off-by: Robert Hughes <robert.hughes@arm.com>
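The dependency-cycle condition mentioned above can be illustrated with a small sketch. This is not the algorithm from the commit. It only shows the underlying check on a hypothetical graph of subgraphs: merging two subgraphs is unsafe when one can reach the other through a third subgraph, because the merged node would then both feed and depend on that third subgraph. The names `Graph`, `HasIndirectPath` and `CanMerge` are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// edges[u] lists the subgraphs that consume outputs of subgraph u.
// Assumption: the input graph is acyclic.
using Graph = std::vector<std::vector<std::size_t>>;

// True if 'to' is reachable from 'from' via at least one intermediate node.
inline bool HasIndirectPath(const Graph& edges, std::size_t from, std::size_t to)
{
    assert(from < edges.size() && to < edges.size());
    std::vector<bool> visited(edges.size(), false);
    std::vector<std::size_t> stack;
    // Skip the direct edge from -> to: a direct edge alone does not create a cycle.
    for (std::size_t next : edges[from])
    {
        if (next != to)
        {
            stack.push_back(next);
        }
    }
    // Depth-first search from the remaining successors.
    while (!stack.empty())
    {
        const std::size_t node = stack.back();
        stack.pop_back();
        if (node == to)
        {
            return true;
        }
        if (visited[node])
        {
            continue;
        }
        visited[node] = true;
        for (std::size_t next : edges[node])
        {
            stack.push_back(next);
        }
    }
    return false;
}

// Merging a and b is safe only if neither reaches the other through a third node.
inline bool CanMerge(const Graph& edges, std::size_t a, std::size_t b)
{
    return !HasIndirectPath(edges, a, b) && !HasIndirectPath(edges, b, a);
}
```

For a graph with edges 0->1, 0->2 and 1->2, merging 0 and 2 is rejected because the path 0->1->2 would turn into a cycle, while merging 0 and 1 is accepted.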
|
|
The default version of message("...") prints to stderr, which is inappropriate
for informational messages such as the ones we are printing in these cases.
Using message(STATUS "...") makes these messages appear on stdout instead,
which is more appropriate.
Change-Id: I02f41e6b4948e6938566f06d7164444bd5b8199e
Signed-off-by: Robert Hughes <robert.hughes@arm.com>
|
|
Signed-off-by: Aron Virginas-Tar <Aron.Virginas-Tar@arm.com>
Change-Id: Ia879f2d84a1b977474ee0dafa976f2aab32bd3ae
|
|
Change-Id: Ic2c0ce7a7a99bbc430b7d6da272825540772e01d
Signed-off-by: Derek Lamberti <derek.lamberti@arm.com>
|
|
Signed-off-by: Aron Virginas-Tar <Aron.Virginas-Tar@arm.com>
Change-Id: I8f698c6ec9826ce1188bc43bd59fbf7b83455c1a
|
|
Signed-off-by: Aron Virginas-Tar <Aron.Virginas-Tar@arm.com>
Change-Id: I263c78e02238fa7c7f9ab6408fb197664e5fe048
|
|
Change-Id: I1f694be7ef1d333b5ef9b60ea7029454ade02628
Signed-off-by: Derek Lamberti <derek.lamberti@arm.com>
|
|
* Replace use of non-standard integral types (e.g. u_char)
* Convert boost::filesystem::paths to std::strings using the .string()
method rather than .c_str(), because on Windows .c_str() returns a wide
character string, which is not convertible to a std::string.
Change-Id: Ia86b0653697033bb1afa01e64b5b2103dd042ffd
Signed-off-by: Robert Hughes <robert.hughes@arm.com>
|
|
* Enabled for Float32 only, as per support in ACL.
Signed-off-by: James Conroy <james.conroy@arm.com>
Change-Id: I251fc832e3058d389ee9bef96856baff89ba6f9a
|
|
* Enabled RefLayerTests for Signed32
Signed-off-by: Francis Murtagh <francis.murtagh@arm.com>
Change-Id: Idbe6fb7607c7e44a8df560b55f28c64a4c4286cd
|
|
* Also enabled copy to/from CL for Signed32.
Signed-off-by: James Conroy <james.conroy@arm.com>
Change-Id: I0113182891f9767de73f04dcd81252c84c996eda
|
|
* Add is_initialised() check to CLScheduler in
ClContextControl.
* Now use CLDepthwiseConvolutionLayer instead of
CLDepthwiseConvolutionLayer3x3.
* Now use NEDepthwiseConvolutionLayer instead of
NEDepthwiseConvolutionLayerOptimized.
!android-nn-driver:2212
Signed-off-by: James Conroy <james.conroy@arm.com>
Change-Id: I509af65315a4322dc820a5cc1bbd36ed6999b4a7
|
|
!android-nn-driver:2260
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>
Change-Id: Iad93c1940568ffa65ed314c8871ea66caf4f9e4a
|
|
Added ProfilingGuid to
* INetwork,
* Network,
* IOptimizedNetwork and
* OptimizedNetwork
!android-nn-driver:2234
!armnn:2250
Signed-off-by: Jan Eilers <jan.eilers@arm.com>
Change-Id: I235116992cc47b4f385b7eb9da514c6350ca00f4
|
|
TransposeConvolution2d
Signed-off-by: Aron Virginas-Tar <Aron.Virginas-Tar@arm.com>
Change-Id: Ie0dc1204eee925adfb1e59aba3f1137178302184
|
|
* Fix input data to allow for loss of precision under Valgrind, which
causes incorrect quantization of multiples of 5 with a scale of 2.
Signed-off-by: Francis Murtagh <francis.murtagh@arm.com>
Change-Id: I354dcb8117e1ab07771b78d0e4808d9f3f95925b
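Why multiples of 5 are fragile with a scale of 2 can be seen in a small sketch. It assumes affine quantization of the form round(value / scale) + offset; the `Quantize` function here is a simplified stand-in, not the library's implementation. An odd multiple of 5 divided by 2 lands exactly on a rounding boundary (5 / 2 = 2.5), so the slightest loss of precision changes the result.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Assumption: affine quantization as round(value / scale) + offset, without clamping.
inline int32_t Quantize(float value, float scale, int32_t offset)
{
    assert(scale != 0.0f);
    return static_cast<int32_t>(std::round(value / scale)) + offset;
}
```

With exact arithmetic, `Quantize(5.0f, 2.0f, 0)` rounds 2.5 away from zero and yields 3, whereas an input that is only slightly smaller, such as 4.999f, yields 2. Test inputs that avoid such boundaries are not sensitive to this.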
|
|
* Temporarily return false from IsConvolution2dSupported() whenever the
weights tensor has per-axis quantization in order to avoid exceptions
being thrown from ACL during attempted execution
* Should be reverted once per-axis quantization support has been
added to the ACL backends
Signed-off-by: Aron Virginas-Tar <Aron.Virginas-Tar@arm.com>
Change-Id: Ie2e1a7f3f5550a4b43f7f007ee5c86a8760872eb
|
|
* Refactoring to enable ProfilingGuid
* Add profiling includes to Android.mk
Signed-off-by: Jan Eilers <jan.eilers@arm.com>
Change-Id: Ieb25e15e3dc302eb42817d824ad8411ac76dcfe8
|
|
* Temporarily handles cases in CalculateEdgeStrategy
where dstFactory pointer is null when import is
disabled.
* This patch is required for ensuring debug layer
works correctly when executing a model on Neon.
Signed-off-by: James Conroy <james.conroy@arm.com>
Change-Id: I7304723246d362d6d9073c3d0b1224e194a8532c
|
|
* Added support for QuantizedSymm8PerAxis to ArmComputeTensorUtils.
Signed-off-by: Mike Kelly <mike.kelly@arm.com>
Change-Id: Ib8662f216bc4b6b54e0099780f73bcf6ef05384b
|
|
Signed-off-by: Finn Williams <Finn.Williams@arm.com>
Change-Id: I2da66efca40bc21d417efc42a225877d94e31428
|
|
* Replace predefined file name with randomly generated file name to
avoid reading back old dumps
Signed-off-by: Aron Virginas-Tar <Aron.Virginas-Tar@arm.com>
Change-Id: Ia48a9cda4527c585453383a5d758e1831c38604a
|
|
* Moved ProfilingGuid to Types.hpp
* Refactoring to enable ProfilingGuid
Signed-off-by: janeil01 <jan.eilers@arm.com>
Change-Id: Ibf77002d74e484f8a63ffd96aa14303c1f0d38ae
|
|
* Added ITimelineDecoder.h C interface
* Added an example implementation of ITimelineDecoder.h
* Added command handlers for the timeline directory and objects
* Added tests for the decoder implementation
* Changed ReadSwTraceMessage to take a const unsigned char*
so it can be used by the directory command handler
* Fixed some bugs in ProfilingUtils.cpp and related tests
Change-Id: If06faf1fe0274a8f022f194a6d3527f5ce5374c6
Signed-off-by: Finn Williams <Finn.Williams@arm.com>
|
|
This was a bug that meant invalid dot files were produced due to MemCopy
layers having a name including "->".
Change-Id: If9f5b13d433f6a7328bf0ad8c7ec89cdce2462b0
Signed-off-by: Rob Hughes <robert.hughes@arm.com>
|
|
* Added RecordEvent utility function to the TimelineUtilityMethods
class
* Added new utility function to get a timestamp
* Added unit tests
Signed-off-by: Matteo Martincigh <matteo.martincigh@arm.com>
Change-Id: Ia3f8fe7397915fa6c903ce0c0abab3047cea628c
|
|
Convolution2d workload
Signed-off-by: Aron Virginas-Tar <Aron.Virginas-Tar@arm.com>
Change-Id: I0ac08ba4864d48e6f64c4ac645dad8ea850be112
|
|
* Add conversion method to reverse bits in Shrink_Axis_Mask
* Add Unit tests for Neon, CL and Reference backends
* Fix supportedness of constant layer which is causing error
in DeepSpeech Uint8
* Also convert the Begin_Mask and End_Mask
Change-Id: I448b083c3463558e8fb5204923ab554cd43264ba
Signed-off-by: Francis Murtagh <francis.murtagh@arm.com>
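The bit reversal applied to the masks could look like the sketch below. It assumes the reversal covers only the lowest `numDim` bits, on the assumption that the two libraries index tensor dimensions in opposite order. `ReverseMaskBits` is a hypothetical name, not the function added by the commit.

```cpp
#include <cassert>
#include <cstdint>

// Reverses the lowest numDim bits of mask, so bit i becomes bit (numDim - 1 - i).
// Bits at or above numDim are dropped.
inline int32_t ReverseMaskBits(int32_t mask, int32_t numDim)
{
    assert(numDim >= 0 && numDim < 32);
    int32_t reversed = 0;
    for (int32_t i = 0; i < numDim; ++i)
    {
        if (mask & (1 << i))
        {
            reversed |= 1 << (numDim - 1 - i);
        }
    }
    return reversed;
}
```

For a 4-dimensional tensor, a mask selecting dimension 0 (binary 0001) becomes a mask selecting dimension 3 (binary 1000). The same conversion would apply to Begin_Mask, End_Mask and Shrink_Axis_Mask.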
|
|
* Add FileOnlyProfilingConnection Decorator
* Fix bug where Conn Ack was not automatically sent back
* Modify GatordMock to use the Counter Directory class.
* Promote DirectoryCaptureCommandHandler from GatordMock into ArmNN.
* Remove MockUtils as its contents were moved or deleted.
* Rewrite GatordMockTests to use Counter Directory class.
* Flush streams in ProfilingConnectionDumpToFileDecorator::Close.
Signed-off-by: Keith Davis <keith.davis@arm.com>
Signed-off-by: Colm Donelan <Colm.Donelan@arm.com>
Change-Id: I77b2aedece24150dd31691b577f3b5d81b2e226f
|
|
This parameter can contain both errors and warnings, so calling it errMessages is confusing as the user only expects to see errors here.
Ideally this rename should be propagated to the lower layers of the implementation,
but the public header change is the most useful part.
Change-Id: I062564cf38d36f950adfa7c37c090b189e068134
|
|
* Using std::thread::id as a general data type for thread id
* Added new profiling util functions for reading/writing a thread id
to/from a buffer
* Fixed code and unit tests accordingly
Signed-off-by: Matteo Martincigh <matteo.martincigh@arm.com>
Change-Id: I1aaa3bdb740c8a97010f655b1e9f7581b52e7aff
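Reading and writing a `std::thread::id` to and from a byte buffer can be sketched with `std::memcpy`, which is valid because the standard requires `std::thread::id` to be trivially copyable. The names `kThreadIdSize`, `WriteThreadId` and `ReadThreadId` are hypothetical and the buffer type is an assumption. The raw bytes are only meaningful within the process that wrote them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <thread>
#include <vector>

constexpr std::size_t kThreadIdSize = sizeof(std::thread::id);

// Copies the raw bytes of the thread id into the buffer at the given offset.
inline void WriteThreadId(std::vector<unsigned char>& buffer, std::size_t offset, std::thread::id id)
{
    assert(offset + kThreadIdSize <= buffer.size());
    std::memcpy(buffer.data() + offset, &id, kThreadIdSize);
}

// Reconstructs a thread id from the raw bytes stored at the given offset.
inline std::thread::id ReadThreadId(const std::vector<unsigned char>& buffer, std::size_t offset)
{
    assert(offset + kThreadIdSize <= buffer.size());
    std::thread::id id;
    std::memcpy(&id, buffer.data() + offset, kThreadIdSize);
    return id;
}
```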
|
|
* Added convenience "using" statement for the unique pointers to
IPacketBuffer
* Replaced all the occurrences in the code
Signed-off-by: Matteo Martincigh <matteo.martincigh@arm.com>
Change-Id: Iffec3a425ffbc1ecb23012971563a48139eb32eb
|
|
* Added ScaledInt32PerAxisDecoder implementation
* Added new case for Signed32 in MakeDecoder that returns a
ScaledInt32PerAxisDecoder if the tensor info has multiple
quantization scales
Signed-off-by: Aron Virginas-Tar <Aron.Virginas-Tar@arm.com>
Change-Id: I8b3c11091644da993044d2a0fe2aba6b06b5af56
|
|
Signed-off-by: Aron Virginas-Tar <Aron.Virginas-Tar@arm.com>
Change-Id: I0bb0e9da306eee3e19dc9967a6c8bb01da998deb
|
|
* Add QuantizedSymm8PerAxis to armnn DataType (types.hpp)
* Add Quantize and Dequantize templates for int8 in TypeUtils to be able to compute QSymm8 of the weights
* Create PerAxisIterator for per-axis quantization
* Create QSymm8PerAxisDecoder
* Create QSymm8PerAxisEncoder
Signed-off-by: Keith Davis <keith.davis@arm.com>
Change-Id: Ibcfe0288a197b7ee50b543bdbd77b7edb8a547c2
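Per-axis decoding differs from per-tensor decoding only in that the scale is looked up per element. A minimal sketch under stated assumptions: symmetric int8 data (no zero point), one scale per index along the quantized axis, and a flat row-major layout. `DequantizePerAxis` and `axisFactor` are hypothetical names and do not describe the classes added by the commit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Dequantizes symmetric int8 data with one scale per index along the quantized axis.
// axisFactor is the number of consecutive elements that share one axis index,
// i.e. the product of the dimensions after the quantized axis.
inline std::vector<float> DequantizePerAxis(const std::vector<int8_t>& data,
                                            const std::vector<float>& scales,
                                            std::size_t axisFactor)
{
    assert(axisFactor > 0 && !scales.empty());
    std::vector<float> result(data.size());
    for (std::size_t i = 0; i < data.size(); ++i)
    {
        // Map the flat element index to its position along the quantized axis.
        const std::size_t axisIndex = (i / axisFactor) % scales.size();
        result[i] = static_cast<float>(data[i]) * scales[axisIndex];
    }
    return result;
}
```

For a weight tensor of shape [2, 2] quantized along axis 0 with scales {0.5, 2.0}, the values {1, 2, 3, 4} decode to {0.5, 1.0, 6.0, 8.0}.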
|
|
* Changed RefDequantizeWorkload to use Encoder/Decoder
* Added related unit tests for Cl, Neon and Ref
Signed-off-by: Jan Eilers <jan.eilers@arm.com>
Change-Id: Ic2fd4103090dd2127c6859b49305736f7b2dfb05
|
|
Signed-off-by: Jung Tae-young <tee.ty.jung@openedges.com>
Change-Id: I44d24b525b78b8d3fee0197abda7bd667eb04d83
|
|
* armnnOnnxParser builds tensorInfo from graph->value_info, but PyTorch
does not add weight/bias tensor information to graph->value_info, so the
tensorInfo of a const tensor should be extracted from graph->initializer
Signed-off-by: Jung Tae-young <tee.ty.jung@openedges.com>
Change-Id: Ib2656dd25abc522012cf413e843fe03949cb2eb0
|