ArmNN
 22.08
ClBackend Class Reference

#include <ClBackend.hpp>

Inheritance diagram for ClBackend:
ClBackend → IBackendInternal → IBackend

Classes

class  ClBackendCustomAllocatorMemoryRegion
 
class  ClBackendCustomAllocatorWrapper
 

Public Member Functions

 ClBackend ()
 
 ClBackend (std::shared_ptr< ICustomAllocator > allocator)
 
 ~ClBackend ()=default
 
const BackendId & GetId () const override
 
IBackendInternal::IMemoryManagerUniquePtr CreateMemoryManager () const override
 
IBackendInternal::IWorkloadFactoryPtr CreateWorkloadFactory (const IBackendInternal::IMemoryManagerSharedPtr &memoryManager=nullptr) const override
 
IBackendInternal::IWorkloadFactoryPtr CreateWorkloadFactory (TensorHandleFactoryRegistry &registry) const override
 
IWorkloadFactoryPtr CreateWorkloadFactory (const IMemoryManagerSharedPtr &memoryManager, const ModelOptions &modelOptions) const override
 
IWorkloadFactoryPtr CreateWorkloadFactory (class TensorHandleFactoryRegistry &tensorHandleFactoryRegistry, const ModelOptions &modelOptions) const override
 
IWorkloadFactoryPtr CreateWorkloadFactory (class TensorHandleFactoryRegistry &tensorHandleFactoryRegistry, const ModelOptions &modelOptions, MemorySourceFlags inputFlags, MemorySourceFlags outputFlags) const override
 
std::vector< ITensorHandleFactory::FactoryId > GetHandleFactoryPreferences () const override
 (Optional) Returns a vector of supported TensorHandleFactory ids in preference order. More...
 
void RegisterTensorHandleFactories (TensorHandleFactoryRegistry &registry) override
 (Optional) Register TensorHandleFactories. Either this method or the CreateMemoryManager() and IWorkloadFactory::CreateTensor()/IWorkloadFactory::CreateSubtensor() methods must be implemented. More...
 
void RegisterTensorHandleFactories (TensorHandleFactoryRegistry &registry, MemorySourceFlags inputFlags, MemorySourceFlags outputFlags) override
 (Optional) Register TensorHandleFactories. Either this method or the CreateMemoryManager() and IWorkloadFactory::CreateTensor()/IWorkloadFactory::CreateSubtensor() methods must be implemented. More...
 
IBackendInternal::IBackendContextPtr CreateBackendContext (const IRuntime::CreationOptions &) const override
 Create the runtime context of the backend. More...
 
IBackendInternal::IBackendProfilingContextPtr CreateBackendProfilingContext (const IRuntime::CreationOptions &, IBackendProfilingPtr &backendProfiling) override
 Create context specifically used for profiling interaction from backends. More...
 
IBackendInternal::ILayerSupportSharedPtr GetLayerSupport () const override
 
IBackendInternal::ILayerSupportSharedPtr GetLayerSupport (const ModelOptions &modelOptions) const override
 
OptimizationViews OptimizeSubgraphView (const SubgraphView &subgraph, const ModelOptions &modelOptions) const override
 
IBackendInternal::IBackendSpecificModelContextPtr CreateBackendSpecificModelContext (const ModelOptions &modelOptions) const override
 
std::unique_ptr< ICustomAllocator > GetDefaultAllocator () const override
 Returns the default memory allocator for the backend. More...
 
BackendCapabilities GetCapabilities () const override
 Returns a BackendCapability if the backend lists the capability. The BackendCapability must then be inspected to check whether that BackendCapability is supported; otherwise returns an EmptyOptional if the BackendCapability is unlisted. More...
 
virtual bool UseCustomMemoryAllocator (std::shared_ptr< ICustomAllocator > allocator, armnn::Optional< std::string &> errMsg) override
 Signals the backend to use a custom memory allocator provided by the user. More...
 
virtual unsigned int GetNumberOfCacheFiles () const override
 Returns the number of files cached if backend supports caching. More...
 
- Public Member Functions inherited from IBackendInternal
 ~IBackendInternal () override=default
 Allow backends created by the factory function to be destroyed through IBackendInternal. More...
 
virtual OptimizationViews OptimizeSubgraphView (const SubgraphView &subgraph) const
 
bool SupportsTensorAllocatorAPI () const
 
ITensorHandleFactory::FactoryId GetBackwardCompatibleFavoriteHandleFactory ()
 
virtual ExecutionData CreateExecutionData (WorkingMemDescriptor &workingMemDescriptor) const
 Returns ExecutionData for the backend. More...
 
virtual void UpdateExecutionData (ExecutionData &executionData, WorkingMemDescriptor &workingMemDescriptor) const
 Update the ExecutionData for a layer. More...
 

Static Public Member Functions

static const BackendId & GetIdStatic ()
 
- Static Public Member Functions inherited from IBackendInternal
static constexpr BackendVersion GetApiVersion ()
 Returns the version of the Backend API. More...
 

Public Attributes

std::shared_ptr< ClBackendCustomAllocatorWrapper > m_CustomAllocator
 
bool m_UsingCustomAllocator = false
 

Additional Inherited Members

- Public Types inherited from IBackendInternal
using IWorkloadFactoryPtr = std::unique_ptr< IWorkloadFactory >
 
using IBackendContextPtr = std::unique_ptr< IBackendContext >
 
using IBackendProfilingContextPtr = std::shared_ptr< arm::pipe::IBackendProfilingContext >
 This is the bridge between backend and backend profiling; we'll keep it in the backend namespace. More...
 
using IBackendProfilingPtr = std::unique_ptr< arm::pipe::IBackendProfiling >
 
using ILayerSupportSharedPtr = std::shared_ptr< ILayerSupport >
 
using IBackendSpecificModelContextPtr = std::shared_ptr< IBackendModelContext >
 
using IMemoryManagerUniquePtr = std::unique_ptr< IMemoryManager >
 
using IMemoryManagerSharedPtr = std::shared_ptr< IMemoryManager >
 
- Protected Member Functions inherited from IBackendInternal
 IBackendInternal ()=default
 Creation must be done through a specific backend interface. More...
 
- Protected Member Functions inherited from IBackend
 IBackend ()
 
virtual ~IBackend ()
 

Detailed Description

Definition at line 37 of file ClBackend.hpp.

Constructor & Destructor Documentation

◆ ClBackend() [1/2]

ClBackend ( )
inline

Definition at line 40 of file ClBackend.hpp.

40 : m_CustomAllocator(nullptr) {};

◆ ClBackend() [2/2]

ClBackend ( std::shared_ptr< ICustomAllocator > allocator)
inline

Definition at line 41 of file ClBackend.hpp.

References ClBackend::GetIdStatic(), ClBackend::UseCustomMemoryAllocator(), and ClBackend::~ClBackend().

42  {
43  std::string err;
44  UseCustomMemoryAllocator(allocator, err);
45  }

◆ ~ClBackend()

~ClBackend ( )
default

Referenced by ClBackend::ClBackend().

Member Function Documentation

◆ CreateBackendContext()

IBackendInternal::IBackendContextPtr CreateBackendContext ( const IRuntime::CreationOptions & ) const
overridevirtual

Create the runtime context of the backend.

Implementations may return a default-constructed IBackendContextPtr if no context is needed at runtime. Implementations must throw BackendUnavailableException if the backend cannot be used (for example, necessary accelerator hardware is not present). The default implementation always returns a default-constructed pointer.

Reimplemented from IBackendInternal.

Definition at line 236 of file ClBackend.cpp.

Referenced by ClBackend::GetId().

237 {
238  return IBackendContextPtr{new ClBackendContext{options}};
239 }

◆ CreateBackendProfilingContext()

IBackendInternal::IBackendProfilingContextPtr CreateBackendProfilingContext ( const IRuntime::CreationOptions & creationOptions,
IBackendProfilingPtr & backendProfiling 
)
overridevirtual

Create context specifically used for profiling interaction from backends.

Reimplemented from IBackendInternal.

Definition at line 241 of file ClBackend.cpp.

Referenced by ClBackend::GetId().

243 {
244  return IBackendProfilingContextPtr{};
245 }

◆ CreateBackendSpecificModelContext()

IBackendInternal::IBackendSpecificModelContextPtr CreateBackendSpecificModelContext ( const ModelOptions & modelOptions) const
overridevirtual

Reimplemented from IBackendInternal.

Definition at line 247 of file ClBackend.cpp.

Referenced by ClBackend::CreateWorkloadFactory(), ClBackend::GetId(), ClBackend::GetLayerSupport(), and ClBackend::OptimizeSubgraphView().

249 {
250  return IBackendSpecificModelContextPtr{new ClBackendModelContext{modelOptions}};
251 }

◆ CreateMemoryManager()

IBackendInternal::IMemoryManagerUniquePtr CreateMemoryManager ( ) const
overridevirtual

Reimplemented from IBackendInternal.

Definition at line 51 of file ClBackend.cpp.

References ClBackend::m_CustomAllocator, and ClBackend::m_UsingCustomAllocator.

Referenced by ClBackend::GetId().

52 {
53  if (m_UsingCustomAllocator)
54  {
55  return std::make_unique<ClMemoryManager>(m_CustomAllocator);
56  }
57  return std::make_unique<ClMemoryManager>(std::make_unique<arm_compute::CLBufferAllocator>());
58 }

◆ CreateWorkloadFactory() [1/5]

IBackendInternal::IWorkloadFactoryPtr CreateWorkloadFactory ( const IBackendInternal::IMemoryManagerSharedPtr & memoryManager = nullptr) const
overridevirtual

Implements IBackendInternal.

Definition at line 60 of file ClBackend.cpp.

Referenced by ClBackend::GetId().

62 {
63  return std::make_unique<ClWorkloadFactory>(
64  PolymorphicPointerDowncast<ClMemoryManager>(memoryManager));
65 }

◆ CreateWorkloadFactory() [2/5]

IBackendInternal::IWorkloadFactoryPtr CreateWorkloadFactory ( TensorHandleFactoryRegistry & registry) const
overridevirtual

Reimplemented from IBackendInternal.

Definition at line 74 of file ClBackend.cpp.

References ClBackend::m_CustomAllocator, ClBackend::m_UsingCustomAllocator, armnn::Malloc, TensorHandleFactoryRegistry::RegisterCopyAndImportFactoryPair(), TensorHandleFactoryRegistry::RegisterFactory(), and TensorHandleFactoryRegistry::RegisterMemoryManager().

76 {
77  std::shared_ptr<ClMemoryManager> memoryManager;
78  if (m_UsingCustomAllocator)
79  {
80  memoryManager = std::make_shared<ClMemoryManager>(m_CustomAllocator);
81  }
82  else
83  {
84  memoryManager = std::make_shared<ClMemoryManager>(std::make_unique<arm_compute::CLBufferAllocator>());
85  }
86 
87  std::unique_ptr<ITensorHandleFactory> factory = std::make_unique<ClTensorHandleFactory>(memoryManager);
88  std::unique_ptr<ITensorHandleFactory> importFactory = std::make_unique<ClImportTensorHandleFactory>(
89  static_cast<MemorySourceFlags>(MemorySource::Malloc), static_cast<MemorySourceFlags>(MemorySource::Malloc));
90 
91  registry.RegisterCopyAndImportFactoryPair(factory->GetId(), importFactory->GetId());
92  registry.RegisterCopyAndImportFactoryPair(importFactory->GetId(), factory->GetId());
93 
94  registry.RegisterMemoryManager(memoryManager);
95  registry.RegisterFactory(std::move(factory));
96  registry.RegisterFactory(std::move(importFactory));
97 
98  return std::make_unique<ClWorkloadFactory>(
99  PolymorphicPointerDowncast<ClMemoryManager>(memoryManager));
100 }

◆ CreateWorkloadFactory() [3/5]

IBackendInternal::IWorkloadFactoryPtr CreateWorkloadFactory ( const IMemoryManagerSharedPtr & memoryManager,
const ModelOptions & modelOptions 
) const
overridevirtual

Reimplemented from IBackendInternal.

Definition at line 67 of file ClBackend.cpp.

References ClBackend::CreateBackendSpecificModelContext().

69 {
70  return std::make_unique<ClWorkloadFactory>(
71  PolymorphicPointerDowncast<ClMemoryManager>(memoryManager), CreateBackendSpecificModelContext(modelOptions));
72 }

◆ CreateWorkloadFactory() [4/5]

IBackendInternal::IWorkloadFactoryPtr CreateWorkloadFactory ( class TensorHandleFactoryRegistry & tensorHandleFactoryRegistry,
const ModelOptions & modelOptions 
) const
overridevirtual

Reimplemented from IBackendInternal.

Definition at line 102 of file ClBackend.cpp.

References ClBackend::CreateBackendSpecificModelContext(), ClBackend::m_CustomAllocator, ClBackend::m_UsingCustomAllocator, armnn::Malloc, TensorHandleFactoryRegistry::RegisterCopyAndImportFactoryPair(), TensorHandleFactoryRegistry::RegisterFactory(), and TensorHandleFactoryRegistry::RegisterMemoryManager().

104 {
105  std::shared_ptr<ClMemoryManager> memoryManager;
106  if (m_UsingCustomAllocator)
107  {
108  memoryManager = std::make_shared<ClMemoryManager>(m_CustomAllocator);
109  }
110  else
111  {
112  memoryManager = std::make_shared<ClMemoryManager>(std::make_unique<arm_compute::CLBufferAllocator>());
113  }
114 
115  std::unique_ptr<ITensorHandleFactory> factory = std::make_unique<ClTensorHandleFactory>(memoryManager);
116  std::unique_ptr<ITensorHandleFactory> importFactory = std::make_unique<ClImportTensorHandleFactory>(
117  static_cast<MemorySourceFlags>(MemorySource::Malloc), static_cast<MemorySourceFlags>(MemorySource::Malloc));
118 
119  registry.RegisterCopyAndImportFactoryPair(factory->GetId(), importFactory->GetId());
120  registry.RegisterCopyAndImportFactoryPair(importFactory->GetId(), factory->GetId());
121 
122  registry.RegisterMemoryManager(memoryManager);
123  registry.RegisterFactory(std::move(factory));
124  registry.RegisterFactory(std::move(importFactory));
125 
126  return std::make_unique<ClWorkloadFactory>(
127  PolymorphicPointerDowncast<ClMemoryManager>(memoryManager), CreateBackendSpecificModelContext(modelOptions));
128 }

◆ CreateWorkloadFactory() [5/5]

IBackendInternal::IWorkloadFactoryPtr CreateWorkloadFactory ( class TensorHandleFactoryRegistry & tensorHandleFactoryRegistry,
const ModelOptions & modelOptions,
MemorySourceFlags  inputFlags,
MemorySourceFlags  outputFlags 
) const
overridevirtual

Reimplemented from IBackendInternal.

Definition at line 130 of file ClBackend.cpp.

References ClBackend::CreateBackendSpecificModelContext(), ClBackend::m_CustomAllocator, ClBackend::m_UsingCustomAllocator, armnn::Malloc, TensorHandleFactoryRegistry::RegisterCopyAndImportFactoryPair(), TensorHandleFactoryRegistry::RegisterFactory(), TensorHandleFactoryRegistry::RegisterMemoryManager(), and armnn::Undefined.

135 {
136  // To allow force import if inputFlags/outputFlags are Undefined, set it as Malloc
137  if (inputFlags == static_cast<MemorySourceFlags>(MemorySource::Undefined))
138  {
139  inputFlags = static_cast<MemorySourceFlags>(MemorySource::Malloc);
140  }
141  if (outputFlags == static_cast<MemorySourceFlags>(MemorySource::Undefined))
142  {
143  outputFlags = static_cast<MemorySourceFlags>(MemorySource::Malloc);
144  }
145  std::shared_ptr<ClMemoryManager> memoryManager;
146  if (m_UsingCustomAllocator)
147  {
148  memoryManager = std::make_shared<ClMemoryManager>(m_CustomAllocator);
149  }
150  else
151  {
152  memoryManager = std::make_shared<ClMemoryManager>(std::make_unique<arm_compute::CLBufferAllocator>());
153  }
154 
155  std::unique_ptr<ITensorHandleFactory> factory = std::make_unique<ClTensorHandleFactory>(memoryManager);
156  std::unique_ptr<ITensorHandleFactory> importFactory = std::make_unique<ClImportTensorHandleFactory>(
157  inputFlags, outputFlags);
158 
159  registry.RegisterCopyAndImportFactoryPair(factory->GetId(), importFactory->GetId());
160  registry.RegisterCopyAndImportFactoryPair(importFactory->GetId(), factory->GetId());
161 
162  registry.RegisterMemoryManager(memoryManager);
163  registry.RegisterFactory(std::move(factory));
164  registry.RegisterFactory(std::move(importFactory));
165 
166  return std::make_unique<ClWorkloadFactory>(
167  PolymorphicPointerDowncast<ClMemoryManager>(memoryManager), CreateBackendSpecificModelContext(modelOptions));
168 }

◆ GetCapabilities()

BackendCapabilities GetCapabilities ( ) const
inlineoverridevirtual

Returns a BackendCapability if the backend lists the capability. The BackendCapability must then be inspected to check whether that BackendCapability is supported; otherwise returns an EmptyOptional if the BackendCapability is unlisted.

Reimplemented from IBackendInternal.

Definition at line 93 of file ClBackend.hpp.

References armnn::gpuAccCapabilities.

94  {
95  return gpuAccCapabilities;
96  };
const BackendCapabilities gpuAccCapabilities("GpuAcc",
{
    {"NonConstWeights", false},
    {"AsyncExecution", false},
    {"ProtectedContentAllocation", true},
    {"ConstantTensorsAsInputs", true},
    {"PreImportIOTensors", false},
    {"ExternallyManagedMemory", true},
    {"MultiAxisPacking", false},
    {"SingleAxisPacking", true}
})

◆ GetDefaultAllocator()

std::unique_ptr< ICustomAllocator > GetDefaultAllocator ( ) const
overridevirtual

Returns the default memory allocator for the backend.

Returns
- Returns a unique pointer to the default allocator of the backend.

Reimplemented from IBackendInternal.

Definition at line 271 of file ClBackend.cpp.

Referenced by ClBackend::GetId().

272 {
273  return std::make_unique<ClBackendDefaultAllocator>();
274 }

◆ GetHandleFactoryPreferences()

std::vector< ITensorHandleFactory::FactoryId > GetHandleFactoryPreferences ( ) const
overridevirtual

(Optional) Returns a vector of supported TensorHandleFactory ids in preference order.

Reimplemented from IBackendInternal.

Definition at line 170 of file ClBackend.cpp.

References ClTensorHandleFactory::GetIdStatic(), and ClImportTensorHandleFactory::GetIdStatic().

Referenced by ClBackend::GetId().

171 {
172  return std::vector<ITensorHandleFactory::FactoryId> {ClTensorHandleFactory::GetIdStatic(),
173  ClImportTensorHandleFactory::GetIdStatic()};
174 }
static const FactoryId & GetIdStatic()

◆ GetId()

const BackendId & GetId ( ) const
overridevirtual

References ClBackend::GetIdStatic().

◆ GetIdStatic()

const BackendId & GetIdStatic ( )
static

Definition at line 45 of file ClBackend.cpp.

References armnn::ClBackendId().

Referenced by ClBackend::ClBackend(), and ClBackend::GetId().

46 {
47  static const BackendId s_Id{ClBackendId()};
48  return s_Id;
49 }
constexpr const char * ClBackendId()
Definition: ClBackendId.hpp:10

◆ GetLayerSupport() [1/2]

IBackendInternal::ILayerSupportSharedPtr GetLayerSupport ( ) const
overridevirtual

Implements IBackendInternal.

Definition at line 253 of file ClBackend.cpp.

Referenced by ClBackend::GetId().

254 {
255  static ILayerSupportSharedPtr layerSupport
256  {
257  new ClLayerSupport(IBackendInternal::IBackendSpecificModelContextPtr{})
258  };
259  return layerSupport;
260 }

◆ GetLayerSupport() [2/2]

IBackendInternal::ILayerSupportSharedPtr GetLayerSupport ( const ModelOptions & modelOptions) const
overridevirtual

Reimplemented from IBackendInternal.

Definition at line 262 of file ClBackend.cpp.

References ClBackend::CreateBackendSpecificModelContext().

263 {
264  static ILayerSupportSharedPtr layerSupport
265  {
266  new ClLayerSupport(CreateBackendSpecificModelContext(modelOptions))
267  };
268  return layerSupport;
269 }

◆ GetNumberOfCacheFiles()

virtual unsigned int GetNumberOfCacheFiles ( ) const
inlineoverridevirtual

Returns the number of files cached if backend supports caching.

Returns
- Returns 0 if the backend does not support caching, otherwise the number of files cached.

Reimplemented from IBackendInternal.

Definition at line 110 of file ClBackend.hpp.

110 { return 1; }

◆ OptimizeSubgraphView()

OptimizationViews OptimizeSubgraphView ( const SubgraphView & subgraph,
const ModelOptions & modelOptions 
) const
overridevirtual

Reimplemented from IBackendInternal.

Definition at line 276 of file ClBackend.cpp.

References armnn::Activation, armnn::Addition, OptimizationViews::AddUntouchedSubgraph(), armnn::BatchNormalization, SubgraphView::beginIConnectable(), Layer::BeginOutputSlots(), armnn::ClAdditionValidate(), armnn::ClBatchNormalizationValidate(), armnn::ClConvolution2dWorkloadValidate(), armnn::ClDepthwiseConvolutionWorkloadValidate(), armnn::ClDivisionWorkloadValidate(), armnn::ClFullyConnectedWorkloadValidate(), armnn::ClMultiplicationWorkloadValidate(), armnn::ClSubtractionValidate(), armnn::Convolution2d, ClBackend::CreateBackendSpecificModelContext(), armnn::DepthwiseConvolution2d, armnn::Division, SubgraphView::endIConnectable(), Layer::EndOutputSlots(), armnn::FullyConnected, Layer::GetAdditionalInformation(), InputSlot::GetConnectedOutputSlot(), Layer::GetGuid(), Layer::GetInputSlot(), Layer::GetName(), OutputSlot::GetNumConnections(), Layer::GetOutputSlot(), OutputSlot::GetOwningLayer(), LayerWithParameters< Parameters >::GetParameters(), OptimizationViews::GetSubstitutions(), OutputSlot::GetTensorInfo(), Layer::GetType(), ClBackendModelContext::IsFastMathEnabled(), BatchNormalizationLayer::m_Beta, Convolution2dDescriptor::m_BiasEnabled, DepthwiseConvolution2dDescriptor::m_BiasEnabled, BatchNormalizationLayer::m_Gamma, BatchNormalizationLayer::m_Mean, BatchNormalizationLayer::m_Variance, armnn::Multiplication, armnn::Pad, armnn::Pooling2d, armnn::Reduce, armnn::ReportUntouchedLayers(), armnn::Subtraction, and armnn::optimizations::pad_fold::TryFoldPadIntoLayer2d().

Referenced by ClBackend::GetId().

278 {
279  OptimizationViews optimizationViews(modelOptions);
280 
281  auto it = subgraph.endIConnectable();
282  bool isFastMathEnabled = false;
283  std::map<LayerGuid, Layer*> untouched;
284 
285  while (it != subgraph.beginIConnectable())
286  {
287  --it;
288  Layer& base = *(PolymorphicDowncast<Layer*>(*it));
289  untouched.insert({base.GetGuid(), &base});
290  }
291 
292  it = subgraph.endIConnectable();
293 #if defined(ARMCOMPUTECL_ENABLED)
294  IBackendInternal::IBackendSpecificModelContextPtr modelContextPtr = CreateBackendSpecificModelContext(modelOptions);
295 
296  if (modelContextPtr)
297  {
298  auto clModelOptions = dynamic_cast<ClBackendModelContext*>(modelContextPtr.get());
299  if (clModelOptions)
300  {
301  isFastMathEnabled = clModelOptions->IsFastMathEnabled();
302  }
303  }
304 #endif
305  while (it != subgraph.beginIConnectable())
306  {
307  --it;
308  Layer& base = *(PolymorphicDowncast<Layer*>(*it));
309 
310  // Fuse activation into previous layer if supported by backend
311  if ((base.GetType() == LayerType::DepthwiseConvolution2d || base.GetType() == LayerType::Convolution2d
312  || base.GetType() == LayerType::BatchNormalization || base.GetType() == LayerType::FullyConnected
313  || base.GetType() == LayerType::Addition || base.GetType() == LayerType::Multiplication
314  || base.GetType() == LayerType::Subtraction || base.GetType() == LayerType::Division)
315  && (base.GetAdditionalInformation<ActivationDescriptor>() == nullptr))
316  {
317  for (auto output = base.BeginOutputSlots(); output != base.EndOutputSlots(); ++output)
318  {
319  if (output->GetNumConnections() == 1)
320  {
321  for (auto&& childInput : output->GetConnections())
322  {
323  if ((childInput->GetOwningLayer().GetType() == LayerType::Activation) &&
324  (checkDataTypeInputandOutput(childInput->GetOwningLayer())))
325  {
326  Layer& child = childInput->GetOwningLayer();
327 
328  auto* activationLayer = PolymorphicDowncast<ActivationLayer*>(&child);
329 
330  const std::string name = std::string("fused-") + child.GetName() + std::string("-into-") +
331  base.GetName();
332 
333  // Get params from activation layer
334  ActivationDescriptor activationDesc = activationLayer->GetParameters();
335 
336  if (base.GetType() == LayerType::Convolution2d)
337  {
338  Convolution2dLayer* baseLayer = PolymorphicDowncast<Convolution2dLayer*>(&base);
339 
340  Optional<TensorInfo> biases;
341 
342  if (baseLayer->GetParameters().m_BiasEnabled)
343  {
344  biases = baseLayer->GetInputSlot(2).GetConnectedOutputSlot()->GetTensorInfo();
345  }
346 
347  arm_compute::Status status = ClConvolution2dWorkloadValidate(
348  baseLayer->GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo(),
349  activationLayer->GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo(),
350  baseLayer->GetParameters(),
351  baseLayer->GetInputSlot(1).GetConnectedOutputSlot()->GetTensorInfo(),
352  biases,
353  isFastMathEnabled,
354  &activationDesc);
355 
356  if (status)
357  {
358  FuseConvolution2dLayer<Convolution2dLayer>(optimizationViews,
359  baseLayer,
360  activationLayer,
361  activationDesc,
362  name);
363  untouched.erase(baseLayer->GetGuid());
364  untouched.erase(activationLayer->GetGuid());
365  }
366  }
367  else if (base.GetType() == LayerType::DepthwiseConvolution2d)
368  {
369  DepthwiseConvolution2dLayer* baseLayer =
370  PolymorphicDowncast<DepthwiseConvolution2dLayer*>(&base);
371 
372  Optional<TensorInfo> biases;
373 
374  if (baseLayer->GetParameters().m_BiasEnabled)
375  {
376  biases = baseLayer->GetInputSlot(2).GetConnectedOutputSlot()->GetTensorInfo();
377  }
378 
379  arm_compute::Status status = ClDepthwiseConvolutionWorkloadValidate(
380  baseLayer->GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo(),
381  activationLayer->GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo(),
382  baseLayer->GetParameters(),
383  baseLayer->GetInputSlot(1).GetConnectedOutputSlot()->GetTensorInfo(),
384  biases,
385  &activationDesc);
386 
387  if (status)
388  {
389  FuseDepthwiseConvolution2dLayer<DepthwiseConvolution2dLayer>(optimizationViews,
390  baseLayer,
391  activationLayer,
392  activationDesc,
393  name);
394  untouched.erase(baseLayer->GetGuid());
395  untouched.erase(activationLayer->GetGuid());
396  }
397  }
398  else if (base.GetType() == LayerType::FullyConnected)
399  {
400  FullyConnectedLayer* baseLayer = PolymorphicDowncast<FullyConnectedLayer*>(&base);
401  FullyConnectedDescriptor descriptor = baseLayer->GetParameters();
402 
403  // As bias is optional only try to get TensorInfo from input if bias is enabled.
404  Optional<TensorInfo> biases;
405  if (descriptor.m_BiasEnabled)
406  {
407  biases = baseLayer->GetInputSlot(2).GetConnectedOutputSlot()->GetTensorInfo();
408  }
409 
410  arm_compute::Status status = ClFullyConnectedWorkloadValidate(
411  baseLayer->GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo(),
412  activationLayer->GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo(),
413  baseLayer->GetInputSlot(1).GetConnectedOutputSlot()->GetTensorInfo(),
414  biases,
415  baseLayer->GetParameters(),
416  &activationDesc);
417 
418  if (status)
419  {
420  FuseFullyConnectedLayer<FullyConnectedLayer>(optimizationViews,
421  baseLayer,
422  activationLayer,
423  activationDesc,
424  name);
425  untouched.erase(baseLayer->GetGuid());
426  untouched.erase(activationLayer->GetGuid());
427  }
428  }
429  else if (base.GetType() == LayerType::BatchNormalization)
430  {
431  BatchNormalizationLayer* baseLayer =
432  PolymorphicDowncast<BatchNormalizationLayer*>(&base);
433 
434  arm_compute::Status status = ClBatchNormalizationValidate(
435  baseLayer->GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo(),
436  activationLayer->GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo(),
437  baseLayer->m_Mean->GetTensorInfo(),
438  baseLayer->m_Variance->GetTensorInfo(),
439  baseLayer->m_Beta->GetTensorInfo(),
440  baseLayer->m_Gamma->GetTensorInfo(),
441  baseLayer->GetParameters(),
442  &activationDesc);
443 
444  if (status)
445  {
446  BatchNormalizationLayer* replacementLayer =
447  FuseBatchNormalizationLayer<BatchNormalizationLayer>(optimizationViews,
448  baseLayer,
449  activationLayer,
450  activationDesc,
451  name);
452 
453  replacementLayer->m_Beta = std::move(baseLayer->m_Beta);
454  replacementLayer->m_Gamma = std::move(baseLayer->m_Gamma);
455  replacementLayer->m_Mean = std::move(baseLayer->m_Mean);
456  replacementLayer->m_Variance = std::move(baseLayer->m_Variance);
457  untouched.erase(baseLayer->GetGuid());
458  untouched.erase(activationLayer->GetGuid());
459  }
460  }
461  else if (base.GetType() == LayerType::Addition)
462  {
463  AdditionLayer* baseLayer = PolymorphicDowncast<AdditionLayer*>(&base);
464 
465  arm_compute::Status status = ClAdditionValidate(
466  baseLayer->GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo(),
467  baseLayer->GetInputSlot(1).GetConnectedOutputSlot()->GetTensorInfo(),
468  activationLayer->GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo(),
469  &activationDesc);
470 
471  if (status)
472  {
473  FuseAdditionLayer<AdditionLayer>(optimizationViews,
474  baseLayer,
475  activationLayer,
476  activationDesc,
477  name);
478  untouched.erase(baseLayer->GetGuid());
479  untouched.erase(activationLayer->GetGuid());
480  }
481  }
482  else if (base.GetType() == LayerType::Division)
483  {
484  DivisionLayer* baseLayer = PolymorphicDowncast<DivisionLayer*>(&base);
485 
486  arm_compute::Status status = ClDivisionWorkloadValidate(
487  baseLayer->GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo(),
488  baseLayer->GetInputSlot(1).GetConnectedOutputSlot()->GetTensorInfo(),
489  activationLayer->GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo(),
490  &activationDesc);
491 
492  if (status)
493  {
494  FuseDivisionLayer<DivisionLayer>(optimizationViews,
495  baseLayer,
496  activationLayer,
497  activationDesc,
498  name);
499  untouched.erase(baseLayer->GetGuid());
500  untouched.erase(activationLayer->GetGuid());
501  }
502  }
503  else if (base.GetType() == LayerType::Multiplication)
504  {
505  MultiplicationLayer* baseLayer = PolymorphicDowncast<MultiplicationLayer*>(&base);
506 
507  arm_compute::Status status = ClMultiplicationWorkloadValidate(
508  baseLayer->GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo(),
509  baseLayer->GetInputSlot(1).GetConnectedOutputSlot()->GetTensorInfo(),
510  activationLayer->GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo(),
511  &activationDesc);
512 
513  if (status)
514  {
515  FuseMultiplicationLayer<MultiplicationLayer>(optimizationViews,
516  baseLayer,
517  activationLayer,
518  activationDesc,
519  name);
520  untouched.erase(baseLayer->GetGuid());
521  untouched.erase(activationLayer->GetGuid());
522  }
523  }
524  else if (base.GetType() == LayerType::Subtraction)
525  {
526  SubtractionLayer* baseLayer = PolymorphicDowncast<SubtractionLayer*>(&base);
527 
528  arm_compute::Status status = ClSubtractionValidate(
529  baseLayer->GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo(),
530  baseLayer->GetInputSlot(1).GetConnectedOutputSlot()->GetTensorInfo(),
531  activationLayer->GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo(),
532  &activationDesc);
533 
534  if (status)
535  {
536  FuseSubtractionLayer<SubtractionLayer>(optimizationViews,
537  baseLayer,
538  activationLayer,
539  activationDesc,
540  name);
541  untouched.erase(baseLayer->GetGuid());
542  untouched.erase(activationLayer->GetGuid());
543  }
544  }
545  }
546  }
547  }
548  }
549  }
550 
551  // Separate reduce layer with multiple axes into multiple reduce layers with 1 axis.
552  if (base.GetType() == LayerType::Reduce)
553  {
554  ReduceLayer* baseLayer = PolymorphicDowncast<ReduceLayer*>(&base);
555  ReduceDescriptor reduceDescriptor = baseLayer->GetParameters();
556 
557  if (!reduceDescriptor.m_vAxis.empty() && reduceDescriptor.m_vAxis.size() > 1)
558  {
559  // Add new layers to the graph and connect them.
560  std::vector<IConnectableLayer*> layers = ChainReduceLayers<ReduceLayer>(optimizationViews,
561  baseLayer,
562  reduceDescriptor);
563 
564  // Replace existing baselayer with new subgraph.
565  ReplaceLayers<ReduceLayer>(optimizationViews, baseLayer, layers);
566  untouched.erase(baseLayer->GetGuid());
567  }
568  }
569 
570  // Special case to fuse padding into average pooling 2d for quantized datatype.
571  // Required to be done as a backend specific optimization as Neon does not support this special case.
572  if (base.GetType() == LayerType::Pooling2d)
573  {
574  Pooling2dLayer* baseLayer = PolymorphicDowncast<Pooling2dLayer*>(&base);
575  Pooling2dDescriptor poolingDescriptor = baseLayer->GetParameters();
576 
577  if (baseLayer->GetInputSlot(0).GetConnectedOutputSlot()->GetOwningLayer().GetType() == LayerType::Pad)
578  {
579  PadLayer* padLayer = PolymorphicDowncast<PadLayer*>(
580  &baseLayer->GetInputSlot(0).GetConnectedOutputSlot()->GetOwningLayer());
581  if (padLayer->GetOutputSlot(0).GetNumConnections() == 1 &&
582  optimizations::pad_fold::TryFoldPadIntoLayer2d(padLayer->GetParameters(),
583  poolingDescriptor,
584  padLayer->GetOutputSlot().GetTensorInfo(),
585  true))
586  {
587  FoldPadIntoAveragePool2d<Pooling2dLayer>(optimizationViews, baseLayer,
588  poolingDescriptor, padLayer);
589  untouched.erase(baseLayer->GetGuid());
590  untouched.erase(padLayer->GetGuid());
591  }
592  }
593  }
594  }
595 
596  if (optimizationViews.GetSubstitutions().empty())
597  {
598  optimizationViews.AddUntouchedSubgraph(SubgraphView(subgraph));
599  }
600  else
601  {
602  ReportUntouchedLayers(optimizationViews, untouched);
603  }
604 
605  return optimizationViews;
606 }

◆ RegisterTensorHandleFactories() [1/2]

void RegisterTensorHandleFactories ( TensorHandleFactoryRegistry &  registry )
overridevirtual

(Optional) Register TensorHandleFactories Either this method or CreateMemoryManager() and IWorkloadFactory::CreateTensor()/IWorkloadFactory::CreateSubtensor() methods must be implemented.

Reimplemented from IBackendInternal.

Definition at line 176 of file ClBackend.cpp.

References ClBackend::m_CustomAllocator, ClBackend::m_UsingCustomAllocator, armnn::Malloc, TensorHandleFactoryRegistry::RegisterCopyAndImportFactoryPair(), TensorHandleFactoryRegistry::RegisterFactory(), and TensorHandleFactoryRegistry::RegisterMemoryManager().

Referenced by ClBackend::GetId().

177 {
178  std::shared_ptr<ClMemoryManager> memoryManager;
179  if (m_UsingCustomAllocator)
180  {
181  memoryManager = std::make_shared<ClMemoryManager>(m_CustomAllocator);
182  }
183  else
184  {
185  memoryManager = std::make_shared<ClMemoryManager>(std::make_unique<arm_compute::CLBufferAllocator>());
186  }
187 
188  std::unique_ptr<ITensorHandleFactory> factory = std::make_unique<ClTensorHandleFactory>(memoryManager);
189  std::unique_ptr<ITensorHandleFactory> importFactory = std::make_unique<ClImportTensorHandleFactory>(
190  static_cast<MemorySourceFlags>(MemorySource::Malloc), static_cast<MemorySourceFlags>(MemorySource::Malloc));
191 
192  registry.RegisterCopyAndImportFactoryPair(factory->GetId(), importFactory->GetId());
193  registry.RegisterCopyAndImportFactoryPair(importFactory->GetId(), factory->GetId());
194 
195  registry.RegisterMemoryManager(memoryManager);
196  registry.RegisterFactory(std::move(factory));
197  registry.RegisterFactory(std::move(importFactory));
198 
199 }

◆ RegisterTensorHandleFactories() [2/2]

void RegisterTensorHandleFactories ( TensorHandleFactoryRegistry &  registry,
MemorySourceFlags  inputFlags,
MemorySourceFlags  outputFlags 
)
overridevirtual

(Optional) Register TensorHandleFactories Either this method or CreateMemoryManager() and IWorkloadFactory::CreateTensor()/IWorkloadFactory::CreateSubtensor() methods must be implemented.

Reimplemented from IBackendInternal.

Definition at line 201 of file ClBackend.cpp.

References ClBackend::m_CustomAllocator, ClBackend::m_UsingCustomAllocator, armnn::Malloc, TensorHandleFactoryRegistry::RegisterCopyAndImportFactoryPair(), TensorHandleFactoryRegistry::RegisterFactory(), TensorHandleFactoryRegistry::RegisterMemoryManager(), and armnn::Undefined.

204 {
205  // To allow force import if inputFlags/outputFlags are Undefined, set it as Malloc
206  if (inputFlags == static_cast<MemorySourceFlags>(MemorySource::Undefined))
207  {
208  inputFlags = static_cast<MemorySourceFlags>(MemorySource::Malloc);
209  }
210  if (outputFlags == static_cast<MemorySourceFlags>(MemorySource::Undefined))
211  {
212  outputFlags = static_cast<MemorySourceFlags>(MemorySource::Malloc);
213  }
214  std::shared_ptr<ClMemoryManager> memoryManager;
215  if (m_UsingCustomAllocator)
216  {
217  memoryManager = std::make_shared<ClMemoryManager>(m_CustomAllocator);
218  }
219  else
220  {
221  memoryManager = std::make_shared<ClMemoryManager>(std::make_unique<arm_compute::CLBufferAllocator>());
222  }
223 
224  std::unique_ptr<ITensorHandleFactory> factory = std::make_unique<ClTensorHandleFactory>(memoryManager);
225  std::unique_ptr<ITensorHandleFactory> importFactory = std::make_unique<ClImportTensorHandleFactory>(
226  inputFlags, outputFlags);
227 
228  registry.RegisterCopyAndImportFactoryPair(factory->GetId(), importFactory->GetId());
229  registry.RegisterCopyAndImportFactoryPair(importFactory->GetId(), factory->GetId());
230 
231  registry.RegisterMemoryManager(memoryManager);
232  registry.RegisterFactory(std::move(factory));
233  registry.RegisterFactory(std::move(importFactory));
234 }

◆ UseCustomMemoryAllocator()

virtual bool UseCustomMemoryAllocator ( std::shared_ptr< ICustomAllocator >  allocator,
armnn::Optional< std::string &>  errMsg 
)
inlineoverridevirtual

Signals the backend to use a custom memory allocator provided by the user.

Parameters
allocator - a pointer to the provided ICustomAllocator to use with this backend
errMsg - Optional string variable to return error messages
Returns
- Returns true if switching to custom allocator was successful

Reimplemented from IBackendInternal.

Definition at line 98 of file ClBackend.hpp.

References ARMNN_LOG, armnn::IgnoreUnused(), armnn::info, ClBackend::m_CustomAllocator, and ClBackend::m_UsingCustomAllocator.

Referenced by ClBackend::ClBackend().

100  {
101  IgnoreUnused(errMsg);
102  ARMNN_LOG(info) << "Using Custom Allocator for ClBackend";
103 
104  // Set flag to signal the backend to use a custom memory allocator
105  m_CustomAllocator = std::make_shared<ClBackendCustomAllocatorWrapper>(std::move(allocator));
106  m_UsingCustomAllocator = true;
107  return m_UsingCustomAllocator;
108  }

Member Data Documentation

◆ m_CustomAllocator

std::shared_ptr< ClBackendCustomAllocatorWrapper > ClBackend::m_CustomAllocator

Definition at line 299 of file ClBackend.hpp.

◆ m_UsingCustomAllocator

bool ClBackend::m_UsingCustomAllocator

Definition at line 300 of file ClBackend.hpp.


The documentation for this class was generated from the following files:

ClBackend.hpp
ClBackend.cpp