From 816e48390c65b5487c1e2525b930b935ca5f4293 Mon Sep 17 00:00:00 2001
From: Sang-Hoon Park
Date: Wed, 21 Apr 2021 14:26:49 +0100
Subject: Update API documentation

API documentation is updated to have description for
- Tensor
- Internal architecture regarding operators and kernels
- Supported data type and layout

Partially Resolves: COMPMID-4200

Change-Id: I17011be2890c724014acd3543d688eb5124ff944
Signed-off-by: Sang-Hoon Park
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5501
Comments-Addressed: Arm Jenkins
Reviewed-by: Pablo Marquez Tello
Tested-by: Arm Jenkins
---
 docs/08_api.dox | 50 ++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 46 insertions(+), 4 deletions(-)

diff --git a/docs/08_api.dox b/docs/08_api.dox
index a73b1bd351..29d31c831d 100644
--- a/docs/08_api.dox
+++ b/docs/08_api.dox
@@ -42,7 +42,7 @@ construction services.
 Compute Library consists of a list of fundamental objects that are responsible for creating and orchestrating operator execution.
 Below we present these objects in more detail.
 
-@subsection api_objects_context @ref AclContext or @ref Context
+@subsection api_objects_context AclContext or Context
 
 AclContext or Context acts as a central creational aggregate service. All other objects are bound to or created from a context.
 It provides, internally, common facilities such as
@@ -52,13 +52,13 @@ It provides, internally, common facilities such as
 
 The followings sections will describe parameters that can be given on the creation of Context.
 
-@subsubsection api_object_context_target @ref AclTarget
+@subsubsection api_object_context_target AclTarget
 Context is initialized with a backend target (AclTarget) as different backends might have a different subset of services.
 Currently the following targets are supported:
 - #AclCpu: a generic CPU target that accelerates primitives through SIMD technologies
 - #AclGpuOcl: a target for GPU acceleration using OpenCL
 
-@subsubsection api_object_context_execution_mode @ref AclExecutionMode
+@subsubsection api_object_context_execution_mode AclExecutionMode
 An execution mode (AclExecutionMode) can be passed as an argument that affects the operator creation.
 At the moment the following execution modes are supported:
 - #AclPreferFastRerun: Provides faster re-run. It can be used when the operators are expected to be executed multiple
@@ -66,7 +66,7 @@ times under the same execution context
 - #AclPreferFastStart: Provides faster single execution. It can be used when the operators will be executed only once,
 thus reducing their latency is important (Currently, it is not implemented)
 
-@subsubsection api_object_context_capabilitys @ref AclTargetCapabilities
+@subsubsection api_object_context_capabilitys AclTargetCapabilities
 Context creation can also have a list of capabilities of hardware as one of its parameters. This is currently available only for the CPU backend.
 A list of architecture capabilities can be passed to influence the selection of the underlying kernels.
 Such capabilities can be for example the enablement of SVE or the dot product
@@ -79,5 +79,47 @@ This user-provided allocator will be used for allocation of any internal backing
 
 @note To enable interoperability with OpenCL, additional entrypoints are provided
 to extract (@ref AclGetClContext) or set (@ref AclSetClContext) the internal OpenCL context.
+
+@subsection api_objects_tensor AclTensor or Tensor
+
+A tensor is a mathematical object that can describe physical properties like matrices.
+It can also be considered a generalization of matrices that can represent arbitrary
+dimensionalities. AclTensor is the abstract interface that represents a tensor.
+
+An AclTensor, in addition to the elements of the physical property it represents,
+also carries metadata such as shape, data type, data layout and strides that not only
+fully describe the characteristics of the physical property but also specify
+how the object stored in memory should be traversed. @ref AclTensorDescriptor is a dedicated
+object to represent such metadata.
+
+@note The allocation of an AclTensor can be deferred until external memory is imported
+as backing memory, to accomplish zero-copy usage.
+
+@note To enable interoperability with OpenCL, additional entrypoints are provided
+to extract (@ref AclGetClMem) the internal OpenCL memory object.
+
+As Tensors can reside in different memory spaces, @ref AclMapTensor and @ref AclUnmapTensor entrypoints
+are provided to map Tensors in and out of the host memory system, respectively.
+
+@section api_internal Internal
+@subsection api_internal_operator_vs_kernels Operators vs Kernels
+
+Internally, Compute Library separates the executable primitives into two categories: kernels and operators,
+which operate in a hierarchical way.
+
+A kernel is the lowest-level computation block whose responsibility is performing a task on a given group of data.
+For design simplicity, a kernel's computation does NOT involve the following:
+
+- Memory allocation: All the memory manipulation should be handled by the caller.
+- Multi-threading: The information on how the workload can be split is provided by kernels,
+so the caller can effectively distribute the workload to multiple threads.
+
+On the other hand, operators combine one or multiple kernels to achieve more complex calculations.
+The responsibilities of the operators can be summarized as follows:
+
+- Defining the scheduling policy and dispatching of the underlying kernels to the hardware backend
+- Providing information to the caller required by the computation (e.g., memory requirements)
+- Allocating any required auxiliary memory if it is not explicitly provided by the caller
+
 */
 } // namespace arm_compute
-- 
cgit v1.2.1
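As a rough illustration of how the objects documented by this patch fit together, the sketch below creates a context for a target, describes and creates a tensor, and uses the map/unmap entrypoints. It is a sketch only: AclTarget (#AclCpu), AclExecutionMode (#AclPreferFastRerun), AclTensorDescriptor, @ref AclMapTensor and @ref AclUnmapTensor are named in the documentation above, but the header name, the creation/destruction entrypoints (AclCreateContext, AclCreateTensor, AclDestroyTensor, AclDestroyContext), the AclContextOptions and descriptor field names, AclFloat32 and AclSuccess are assumptions made for this example, not taken from the patch.

@code{.c}
/* Hedged sketch of the C API described above.
 * Assumed (not confirmed by the patch): the header name, AclCreateContext/AclCreateTensor/
 * AclDestroy* entrypoints, the AclContextOptions and AclTensorDescriptor field names,
 * and the AclFloat32/AclSuccess enumerators. */
#include <stdbool.h>
#include <stdint.h>
#include "arm_compute/Acl.h" /* assumed umbrella header for the C API */

int main(void)
{
    /* 1. Create a context for the CPU backend (#AclCpu is listed in the docs above).
     *    The options struct is assumed to carry the execution-mode hint. */
    AclContextOptions opts = {0};
    opts.mode = AclPreferFastRerun; /* field name assumed; the enum value is from the docs */

    AclContext ctx = NULL; /* opaque handle (assumed) */
    if (AclCreateContext(&ctx, AclCpu, &opts) != AclSuccess) /* entrypoint/status assumed */
        return 1;

    /* 2. Describe a 2x3 float tensor: shape, data type, layout and strides are the
     *    metadata the docs say a tensor descriptor carries. Field names are assumed,
     *    and the shape ordering is purely illustrative. */
    int32_t shape[] = {3, 2};
    AclTensorDescriptor desc = {0};
    desc.ndims     = 2;
    desc.shape     = shape;
    desc.data_type = AclFloat32;

    AclTensor tensor = NULL;
    AclCreateTensor(&tensor, ctx, &desc, true /* allocate backing memory now */);

    /* 3. Map the tensor into host memory, fill it, then unmap
     *    (AclMapTensor/AclUnmapTensor are named in the docs; signatures assumed). */
    void *data = NULL;
    AclMapTensor(tensor, &data);
    float *f = (float *)data;
    for (int i = 0; i < 6; ++i)
        f[i] = (float)i;
    AclUnmapTensor(tensor, data);

    /* 4. Release resources (destroy entrypoints assumed). */
    AclDestroyTensor(tensor);
    AclDestroyContext(ctx);
    return 0;
}
@endcode

If the operators built on such a context were only going to be executed once, the #AclPreferFastStart hint described above would be the more appropriate choice once it is implemented.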
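The interoperability notes mention @ref AclGetClContext, @ref AclSetClContext and @ref AclGetClMem but not their signatures. A possible usage pattern, with assumed signatures and header locations, is to run on the #AclGpuOcl target and hand the underlying OpenCL handles to an existing OpenCL pipeline:

@code{.c}
/* Hedged OpenCL-interop sketch. The entrypoint names come from the documentation above;
 * their exact signatures and the header locations are assumptions. */
#include <CL/cl.h>
#include "arm_compute/Acl.h"          /* assumed */
#include "arm_compute/AclOpenClExt.h" /* assumed location of the CL interop entrypoints */

void share_with_existing_cl_pipeline(AclContext gpu_ctx, AclTensor tensor)
{
    /* Extract the OpenCL context the library uses internally (alternatively, an
     * application-owned context could be installed up front via AclSetClContext). */
    cl_context cl_ctx = NULL;
    AclGetClContext(gpu_ctx, &cl_ctx); /* signature assumed */

    /* Extract the cl_mem backing the tensor so an external kernel can consume it
     * without copying, in the zero-copy spirit of the notes above. */
    cl_mem buffer = NULL;
    AclGetClMem(tensor, &buffer); /* signature assumed */

    /* ... enqueue application-side clSetKernelArg()/clEnqueueNDRangeKernel() work
     *     against `buffer` here ... */
    (void)cl_ctx;
    (void)buffer;
}
@endcode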