From c9309f22a026dfce92365e2f0802c40e8e1c449e Mon Sep 17 00:00:00 2001
From: Sang-Hoon Park
Date: Wed, 5 May 2021 10:34:47 +0100
Subject: Restructure Documentation (Part 2)

The followings are done:
- Move operator list documnetation
- Introduction page is moved to the top
- The sections for experimental API and programming model are merged into library architecture page.

Resolves: COMPMID-4198

Change-Id: Iad824d6c8ba8d31e0bf76afd3fb67abbe32a1667
Signed-off-by: Sang-Hoon Park
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5570
Tested-by: Arm Jenkins
Reviewed-by: Michele Di Giorgio
Reviewed-by: Sheri Zhang
Comments-Addressed: Arm Jenkins
---
 docs/user_guide/api.dox | 135 ------------------------------------------------
 1 file changed, 135 deletions(-)
 delete mode 100644 docs/user_guide/api.dox

(limited to 'docs/user_guide/api.dox')

diff --git a/docs/user_guide/api.dox b/docs/user_guide/api.dox
deleted file mode 100644
index 39282046a9..0000000000
--- a/docs/user_guide/api.dox
+++ /dev/null
@@ -1,135 +0,0 @@
-///
-/// Copyright (c) 2021 Arm Limited.
-///
-/// SPDX-License-Identifier: MIT
-///
-/// Permission is hereby granted, free of charge, to any person obtaining a copy
-/// of this software and associated documentation files (the "Software"), to
-/// deal in the Software without restriction, including without limitation the
-/// rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
-/// sell copies of the Software, and to permit persons to whom the Software is
-/// furnished to do so, subject to the following conditions:
-///
-/// The above copyright notice and this permission notice shall be included in all
-/// copies or substantial portions of the Software.
-///
-/// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
-/// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
-/// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
-/// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
-/// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
-/// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
-/// SOFTWARE.
-///
-namespace arm_compute
-{
-/**
-@page api Application Programming Interface
-
-@tableofcontents
-
-@section api_overview Overview
-
-In this section we present Compute Library's application programming interface (API) architecture along with
-a detailed explanation of its components. Compute Library's API consists of multiple high-level operators and
-even more internally distinct computational blocks that can be executed on a command queue.
-Operators can be bound to multiple Tensor objects and executed concurrently or asynchronously if needed.
-All operators and associated objects are encapsulated in a Context-based mechanism, which provides all related
-construction services.
-
-@section api_objects Fundamental objects
-
-Compute Library consists of a list of fundamental objects that are responsible for creating and orchestrating operator execution.
-Below we present these objects in more detail.
-
-@subsection api_objects_context AclContext or Context
-
-AclContext or Context acts as a central creational aggregate service. All other objects are bound to or created from a context.
-It provides, internally, common facilities such as
-- allocators for object creation or backing memory allocation
-- serialization interfaces
-- any other modules that affect the construction of objects (e.g., program cache for OpenCL).
-
-The followings sections will describe parameters that can be given on the creation of Context.
-
-@subsubsection api_object_context_target AclTarget
-Context is initialized with a backend target (AclTarget) as different backends might have a different subset of services.
-Currently the following targets are supported:
-- #AclCpu: a generic CPU target that accelerates primitives through SIMD technologies
-- #AclGpuOcl: a target for GPU acceleration using OpenCL
-
-@subsubsection api_object_context_execution_mode AclExecutionMode
-An execution mode (AclExecutionMode) can be passed as an argument that affects the operator creation.
-At the moment the following execution modes are supported:
-- #AclPreferFastRerun: Provides faster re-run. It can be used when the operators are expected to be executed multiple
-times under the same execution context
-- #AclPreferFastStart: Provides faster single execution. It can be used when the operators will be executed only once,
-thus reducing their latency is important (Currently, it is not implemented)
-
-@subsubsection api_object_context_capabilitys AclTargetCapabilities
-Context creation can also have a list of capabilities of hardware as one of its parameters. This is currently
-available only for the CPU backend. A list of architecture capabilities can be passed to influence the selection
-of the underlying kernels. Such capabilities can be for example the enablement of SVE or the dot product
-instruction explicitly.
-@note The underlying hardware should support the given capability list.
-
-@subsubsection api_object_context_allocator Allocator
-An allocator object that implements @ref AclAllocator can be passed to the Context upon its creation.
-This user-provided allocator will be used for allocation of any internal backing memory.
-
-@note To enable interoperability with OpenCL, additional entrypoints are provided
-to extract (@ref AclGetClContext) or set (@ref AclSetClContext) the internal OpenCL context.
-
-@subsection api_objects_tensor AclTensor or Tensor
-
-A tensor is a mathematical object that can describe physical properties like matrices.
-It can be also considered a generalization of matrices that can represent arbitrary
-dimensionalities. AclTensor is an abstracted interface that represents a tensor.
-
-AclTensor, in addition to the elements of the physical properties they represent,
-also contains the information such as shape, data type, data layout and strides to not only
-fully describe the characteristics of the physical properties but also provide information
-how the object stored in memory should be traversed. @ref AclTensorDescriptor is a dedicated
-object to represent such metadata.
-
-@note The allocation of an AclTensor can be deferred until external memory is imported
-as backing memory to accomplish a zero-copy context.
-
-@note To enable interoperability with OpenCL, additional entrypoints are provided
-to extract (@ref AclGetClMem) the internal OpenCL memory object.
-
-As Tensors can reside in different memory spaces, @ref AclMapTensor and @ref AclUnmapTensor entrypoints
-are provided to map Tensors in and out of the host memory system, respectively.
-
-@subsection api_objects_queue AclQueue or Queue
-
-AclQueue acts as a runtime aggregate service. It provides facilities to schedule
-and execute operators using underlying hardware. It also contains services like
-tuning mechanisms (e.g., Local workgroup size tuning for OpenCL) that can be specified
-during operator execution.
-
-@note To enable interoperability with OpenCL, additional entrypoints are provided
-to extract (@ref AclGetClQueue) or set (@ref AclSetClQueue) the internal OpenCL queue.
-
-@section api_internal Internal
-@subsection api_internal_operator_vs_kernels Operators vs Kernels
-
-Internally, Compute Library separates the executable primitives in two categories: kernels and operators
-which operate in a hierarchical way.
-
-A kernel is the lowest-level computation block whose responsibility is performing a task on a given group of data.
-For design simplicity, kernels computation does NOT involve the following:
-
-- Memory allocation: All the memory manipulation should be handled by the caller.
-- Multi-threading: The information on how the workload can be split is provided by kernels,
-so the caller can effectively distribute the workload to multiple threads.
-
-On the other hand, operators combine one or multiple kernels to achieve more complex calculations.
-The responsibilities of the operators can be summarized as follows:
-
-- Defining the scheduling policy and dispatching of the underlying kernels to the hardware backend
-- Providing information to the caller required by the computation (e.g., memory requirements)
-- Allocation of any required auxiliary memory if it isn't given by its caller explicitly
-
-*/
-} // namespace arm_compute
-- 
cgit v1.2.1