///
/// Copyright (c) 2021 Arm Limited.
///
/// SPDX-License-Identifier: MIT
///
/// Permission is hereby granted, free of charge, to any person obtaining a copy
/// of this software and associated documentation files (the "Software"), to
/// deal in the Software without restriction, including without limitation the
/// rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
/// sell copies of the Software, and to permit persons to whom the Software is
/// furnished to do so, subject to the following conditions:
///
/// The above copyright notice and this permission notice shall be included in all
/// copies or substantial portions of the Software.
///
/// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
/// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
/// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
/// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
/// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
/// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
/// SOFTWARE.
///
namespace arm_compute
{
/** @page operators_list Supported Operators

@tableofcontents

@section S9_1_operators_list Supported Operators

Compute Library supports the operators listed in the table below.

Compute Library supports a wide range of data types; the exact combinations accepted by each kernel/function can be found directly in its documentation. The main data types that the Machine Learning functions support are the following:
<ul>
    <li>BFLOAT16: 16-bit brain floating point
    <li>QASYMM8: 8-bit unsigned asymmetric quantized
    <li>QASYMM8_SIGNED: 8-bit signed asymmetric quantized
    <li>QSYMM8_PER_CHANNEL: 8-bit symmetric quantized per channel (used for weights)
    <li>QSYMM8: 8-bit symmetric quantized
    <li>QSYMM16: 16-bit symmetric quantized
    <li>F32: 32-bit single precision floating point
    <li>F16: 16-bit half precision floating point
    <li>S32: 32-bit signed integer
    <li>U8: 8-bit unsigned char
    <li>All: the function is agnostic to the data type
</ul>

Compute Library supports the following data layouts (fast changing dimension from right to left):
<ul>
    <li>NHWC: the native layout of Compute Library, with channels in the fastest changing dimension
    <li>NCHW: legacy layout, with width in the fastest changing dimension
    <li>All: the function is agnostic to the data layout
</ul>
where N = batches, C = channels, H = height, W = width
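
Most of the functions listed in the table follow the same usage pattern: initialise tensors with a TensorInfo describing a supported data type and layout, call configure() on the function, allocate the tensors, then call run(). As a quick orientation, the sketch below (the tensor shape and the choice of RELU are illustrative only, not taken from the table) runs the first operator of the table, ActivationLayer, through its CPU backend NEActivationLayer; a matching OpenCL sketch is given after the table.

@code{.cpp}
#include "arm_compute/runtime/NEON/NEFunctions.h"
#include "arm_compute/runtime/Tensor.h"

using namespace arm_compute;

int main()
{
    // Describe an F32 source/destination pair (shape is illustrative)
    Tensor src, dst;
    src.allocator()->init(TensorInfo(TensorShape(27U, 11U, 3U), 1, DataType::F32));
    dst.allocator()->init(TensorInfo(TensorShape(27U, 11U, 3U), 1, DataType::F32));

    // Configure the CPU backend of ActivationLayer with a RELU activation
    NEActivationLayer act;
    act.configure(&src, &dst, ActivationLayerInfo(ActivationLayerInfo::ActivationFunction::RELU));

    // Allocate backing memory, then execute
    src.allocator()->allocate();
    dst.allocator()->allocate();
    act.run();

    return 0;
}
@endcode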
Function | Description | Equivalent Android NNAPI Op | Backends | Data Layouts | Data Types (src/dst combinations per backend)
ActivationLayer Function to simulate an activation layer with the specified activation function.
  • ANEURALNETWORKS_ELU
  • ANEURALNETWORKS_HARD_SWISH
  • ANEURALNETWORKS_LOGISTIC
  • ANEURALNETWORKS_RELU
  • ANEURALNETWORKS_RELU1
  • ANEURALNETWORKS_RELU6
  • ANEURALNETWORKS_TANH
NEActivationLayer
  • All
srcdst
QASYMM8QASYMM8
QASYMM8_SIGNEDQASYMM8_SIGNED
QSYMM16QSYMM16
F16F16
F32F32
CLActivationLayer
  • All
srcdst
QASYMM8QASYMM8
QASYMM8_SIGNEDQASYMM8_SIGNED
QSYMM16QSYMM16
F16F16
F32F32
ArgMinMaxLayer Function to calculate the index of the minimum or maximum values in a tensor based on an axis.
  • ANEURALNETWORKS_ARGMAX
  • ANEURALNETWORKS_ARGMIN
NEArgMinMaxLayer
  • All
srcdst
QASYMM8U32, S32
QASYMM8_SIGNEDU32, S32
S32U32, S32
F16U32, S32
F32U32, S32
CLArgMinMaxLayer
  • All
srcdst
QASYMM8U32, S32
QASYMM8_SIGNEDU32, S32
S32U32, S32
F16U32, S32
F32U32, S32
ArithmeticAddition Function to add 2 tensors.
  • ANEURALNETWORKS_ADD
NEArithmeticAddition
  • All
src0src1dst
QASYMM8QASYMM8QASYMM8
QASYMM8_SIGNEDQASYMM8_SIGNEDQASYMM8_SIGNED
QSYMM16QSYMM16QASYMM16
QSYMM16QSYMM16S32
U8U8U8
U8U8S16
U8S16S16
S16U8S16
S16S16S16
S32S32S32
F16F16F16
F32F32F32
ArithmeticSubtraction Function to subtract 2 tensors.
  • ANEURALNETWORKS_SUB
NEArithmeticSubtraction
  • All
src0src1dst
QASYMM8QASYMM8QASYMM8
QASYMM8_SIGNEDQASYMM8_SIGNEDQASYMM8_SIGNED
QSYMM16QSYMM16QASYMM16
QSYMM16QSYMM16S32
U8U8U8
U8U8S16
U8S16S16
S16U8S16
S16S16S16
S32S32S32
F16F16F16
F32F32F32
BatchNormalizationLayer Function to perform batch normalization.
  • n/a
NEBatchNormalizationLayer
  • NHWC
  • NCHW
srcdst
F32F32
F16F16
CLBatchNormalizationLayer
  • NHWC
  • NCHW
srcdst
F32F32
F16F16
BatchToSpaceLayer Batch to space transformation.
  • ANEURALNETWORKS_BATCH_TO_SPACE_ND
NEBatchToSpaceLayer
  • NHWC
  • NCHW
src0src1dst
AllS32All
CLBatchToSpaceLayer
  • NHWC
  • NCHW
src0src1dst
AllS32All
BitwiseAnd Function to perform bitwise AND between 2 tensors.
  • ANEURALNETWORKS_LOGICAL_AND
NEBitwiseAnd
  • All
srcdst
U8U8
CLBitwiseAnd
  • All
srcdst
U8U8
BitwiseNot Function to perform bitwise NOT.
  • ANEURALNETWORKS_LOGICAL_NOT
NEBitwiseNot
  • All
srcdst
U8U8
CLBitwiseNot
  • All
srcdst
U8U8
BitwiseOr Function to perform bitwise OR between 2 tensors.
  • ANEURALNETWORKS_LOGICAL_OR
NEBitwiseOr
  • All
srcdst
U8U8
CLBitwiseOr
  • All
srcdst
U8U8
BitwiseXor Function to perform bitwise XOR between 2 tensors.
  • n/a
NEBitwiseXor
  • All
srcdst
U8U8
CLBitwiseXor
  • All
srcdst
U8U8
BoundingBoxTransform Function to transform proposal bounding boxes to target bounding boxes using bounding box deltas.
  • n/a
NEBoundingBoxTransform
  • NHWC
  • NCHW
src0src1dst
QASYMM16QASYMM8QASYMM16
F16F16F16
F32F32F32
CLBoundingBoxTransform
  • NHWC
  • NCHW
src0src1dst
QASYMM16QASYMM8QASYMM16
F16F16F16
F32F32F32
Cast Function to cast a tensor.
  • ANEURALNETWORKS_CAST
NECast
  • All
srcdst
QASYMM8_SIGNEDS16, S32, F32, F16
QASYMM8U16, S16, S32, F32, F16
U8U16, S16, S32, F32, F16
U16U8, U32
S16QASYMM8_SIGNED, U8, S32
F16QASYMM8_SIGNED, QASYMM8, F32, S32, U8
S32QASYMM8_SIGNED, QASYMM8, F16, F32, U8
F32QASYMM8_SIGNED, QASYMM8, BFLOAT16, F16, S32, U8
CLCast
  • All
srcdst
U8S8, U16, S16, U32, S32, F16, F32
U16U8, S8, S16, U32, S32, F16, F32
S16U8, S8, U16, U32, S32, F16, F32
U32U8, S8, U16, S16, S32, F16, F32
S32U8, S8, U16, S16, U32, F16, F32
F16U8, S8, U16, S16, U32, F32
F32U8, S8, U16, S16, U32, F16
ChannelShuffleLayer Function to shuffle the channels of the input tensor.
  • ANEURALNETWORKS_CHANNEL_SHUFFLE
NEChannelShuffleLayer
  • NCHW
srcdst
AllAll
CLChannelShuffleLayer
  • NCHW
srcdst
AllAll
Comparison Function to compare 2 tensors.
  • ANEURALNETWORKS_EQUAL
  • ANEURALNETWORKS_GREATER
  • ANEURALNETWORKS_GREATER_EQUAL
  • ANEURALNETWORKS_LESS
  • ANEURALNETWORKS_LESS_EQUAL
  • ANEURALNETWORKS_NOT_EQUAL
CLComparison
  • All
src0src1dst
AllAllU8
ConcatenateLayer Function to concatenate tensors along a given axis.
  • ANEURALNETWORKS_CONCATENATION
NEConcatenateLayer
  • All
srcdst
QASYMM8QASYMM8
QASYMM8_SIGNEDQASYMM8_SIGNED
F16F16
F32F32
CLConcatenateLayer
  • All
srcdst
QASYMM8QASYMM8
QASYMM8_SIGNEDQASYMM8_SIGNED
F16F16
F32F32
ConvertFullyConnectedWeights Function to transpose the weights for the fully connected layer.
  • n/a
NEConvertFullyConnectedWeights
  • NHWC
  • NCHW
srcdst
AllAll
CLConvertFullyConnectedWeights
  • NHWC
  • NCHW
srcdst
AllAll
ConvolutionLayer Function to compute a convolution layer.
  • ANEURALNETWORKS_CONV_2D
NEConvolutionLayer
  • NHWC
  • NCHW
src0src1src2dst
F16F16F16F16
F32F32F32F32
QASYMM8QASYMM8S32QASYMM8
QASYMM8QSYMM8_PER_CHANNELS32QASYMM8
QASYMM8_SIGNEDQASYMM8_SIGNEDS32QASYMM8_SIGNED
QASYMM8_SIGNEDQSYMM8_PER_CHANNELS32QASYMM8_SIGNED
CLConvolutionLayer
  • NHWC
  • NCHW
src0src1src2dst
F16F16F16F16
F32F32F32F32
QASYMM8QASYMM8S32QASYMM8
QASYMM8QSYMM8_PER_CHANNELS32QASYMM8
QASYMM8_SIGNEDQASYMM8_SIGNEDS32QASYMM8_SIGNED
QASYMM8_SIGNEDQSYMM8_PER_CHANNELS32QASYMM8_SIGNED
Copy Function to copy a tensor.
  • n/a
NECopy
  • All
srcdst
AllAll
CLCopy
  • All
srcdst
AllAll
Crop Function to crop a region of the input tensor and copy it to the output tensor.
  • n/a
CLCrop
  • NHWC
srcdst
AllF32
CropResize Function to perform cropping and resizing.
  • n/a
NECropResize
  • NHWC
src0src1src2dst
AllF32F32F32
CLCropResize
  • NHWC
src0src1src2dst
AllF32F32F32
DeconvolutionLayer Function to compute a deconvolution or transpose convolution.
  • ANEURALNETWORKS_TRANSPOSE_CONV_2D
NEDeconvolutionLayer
  • NHWC
  • NCHW
src0src1src2dst
F16F16F16F16
F32F32F32F32
QASYMM8QASYMM8S32QASYMM8
QASYMM8QSYMM8_PER_CHANNELS32QASYMM8
QASYMM8_SIGNEDQASYMM8_SIGNEDS32QASYMM8_SIGNED
QASYMM8_SIGNEDQSYMM8_PER_CHANNELS32QASYMM8_SIGNED
CLDeconvolutionLayer
  • NHWC
  • NCHW
src0src1src2dst
F16F16F16F16
F32F32F32F32
QASYMM8QASYMM8S32QASYMM8
QASYMM8QSYMM8_PER_CHANNELS32QASYMM8
QASYMM8_SIGNEDQASYMM8_SIGNEDS32QASYMM8_SIGNED
QASYMM8_SIGNEDQSYMM8_PER_CHANNELS32QASYMM8_SIGNED
DeconvolutionLayerUpsample Function to execute deconvolution upsample on OpenCL.
  • ANEURALNETWORKS_TRANSPOSE_CONV_2D
CLDeconvolutionLayerUpsample
  • NHWC
  • NCHW
srcdst
AllAll
DepthConvertLayer Performs a down-scaling depth conversion.
  • n/a
NEDepthConvertLayer
  • All
srcdst
QASYMM8F16, F32
U8U16, S16, S32
U16U8, U32
S16U8, S32
BFLOAT16F32
F16QASYMM8, F32
F32QASYMM8, F16, BFLOAT16
CLDepthConvertLayer
  • All
srcdst
U8S8, U16, S16, U32, S32, F16, F32
U16U8, S8, S16, U32, S32, F16, F32
S16U8, S8, U16, U32, S32, F16, F32
U32U8, S8, U16, S16, S32, F16, F32
S32U8, S8, U16, S16, U32, F16, F32
F16U8, S8, U16, S16, U32, F32
F32U8, S8, U16, S16, U32, F16
DepthToSpaceLayer Depth to Space transformation.
  • ANEURALNETWORKS_DEPTH_TO_SPACE
NEDepthToSpaceLayer
  • NHWC
  • NCHW
srcdst
AllAll
CLDepthToSpaceLayer
  • NHWC
  • NCHW
srcdst
AllAll
DepthwiseConvolutionLayer Function to perform depthwise separable convolution.
  • ANEURALNETWORKS_DEPTHWISE_CONV_2D
NEDepthwiseConvolutionLayer
  • NHWC
  • NCHW
src0src1src2dst
F16F16F16F16
F32F32F32F32
QASYMM8QASYMM8S32QASYMM8
QASYMM8QSYMM8_PER_CHANNELS32QASYMM8
QASYMM8_SIGNEDQASYMM8_SIGNEDS32QASYMM8_SIGNED
QASYMM8_SIGNEDQSYMM8_PER_CHANNELS32QASYMM8_SIGNED
CLDepthwiseConvolutionLayer
  • NHWC
  • NCHW
src0src1src2dst
F16F16F16F16
F32F32F32F32
QASYMM8QASYMM8S32QASYMM8
QASYMM8QSYMM8_PER_CHANNELS32QASYMM8
QASYMM8_SIGNEDQASYMM8_SIGNEDS32QASYMM8_SIGNED
QASYMM8_SIGNEDQSYMM8_PER_CHANNELS32QASYMM8_SIGNED
DequantizationLayer Function to dequantize the values in a tensor.
  • ANEURALNETWORKS_DEQUANTIZE
NEDequantizationLayer
  • All
srcdst
QASYMM8F16, F32
QASYMM8_SIGNEDF16, F32
QSYMM8_PER_CHANNELF16, F32
QSYMM8F16, F32
QSYMM16F16, F32
CLDequantizationLayer
  • All
srcdst
QASYMM8F16, F32
QASYMM8_SIGNEDF16, F32
QSYMM8_PER_CHANNELF16, F32
QSYMM8F16, F32
QSYMM16F16, F32
DetectionPostProcessLayer Function to generate the detection output based on center-size encoded boxes, class predictions and anchors, using non-maximum suppression (NMS).
  • ANEURALNETWORKS_DETECTION_POSTPROCESSING
NEDetectionPostProcessLayer
  • All
src0 - src2dst0 - dst3
QASYMM8F32
QASYMM8_SIGNEDF32
F32F32
DirectConvolutionLayer Function to compute direct convolution.
  • ANEURALNETWORKS_CONV_2D
NEDirectConvolutionLayer
  • NHWC
  • NCHW
src0src1src2dst
F16F16F16F16
F32F32F32F32
CLDirectConvolutionLayer
  • NHWC
  • NCHW
src0src1src2dst
F16F16F16F16
F32F32F32F32
QASYMM8QASYMM8S32QASYMM8
QASYMM8_SIGNEDQASYMM8_SIGNEDS32QASYMM8_SIGNED
DirectDeconvolutionLayer Function to run the deconvolution layer.
  • ANEURALNETWORKS_TRANSPOSE_CONV_2D
CLDirectDeconvolutionLayer
  • NHWC
  • NCHW
src0src1src2dst
F16F16F16F16
F32F32F32F32
QASYMM8QASYMM8S32QASYMM8
QASYMM8_SIGNEDQASYMM8_SIGNEDS32QASYMM8_SIGNED
QASYMM8QSYMM8_PER_CHANNELS32QASYMM8
QASYMM8_SIGNEDQSYMM8_PER_CHANNELS32QASYMM8_SIGNED
ElementWiseOperations Function to perform element-wise operations. On CPU: Div, Max, Min, Pow, SquaredDiff and Comparisons (Equal, Greater, GreaterEqual, Less, LessEqual, NotEqual). On OpenCL: Add, Sub, Div, Max, Min, Pow and SquaredDiff.
  • ANEURALNETWORKS_MAXIMUM
  • ANEURALNETWORKS_MINIMUM
  • ANEURALNETWORKS_POW
  • ANEURALNETWORKS_DIV
  • ANEURALNETWORKS_ADD
  • ANEURALNETWORKS_SUB
  • ANEURALNETWORKS_EQUAL
  • ANEURALNETWORKS_GREATER
  • ANEURALNETWORKS_GREATER_EQUAL
  • ANEURALNETWORKS_LESS
  • ANEURALNETWORKS_LESS_EQUAL
  • ANEURALNETWORKS_NOT_EQUAL
NEElementwiseMax
  • All
src0src1dst
QASYMM8QASYMM8QASYMM8
QASYMM8_SIGNEDQASYMM8_SIGNEDQASYMM8_SIGNED
S32S32S32
S16S16S16
F16F16F16
F32F32F32
NEElementwiseMin
  • All
src0src1dst
QASYMM8QASYMM8QASYMM8
QASYMM8_SIGNEDQASYMM8_SIGNEDQASYMM8_SIGNED
S32S32S32
S16S16S16
F16F16F16
F32F32F32
NEElementwiseSquaredDiff
  • All
src0src1dst
QASYMM8QASYMM8QASYMM8
QASYMM8_SIGNEDQASYMM8_SIGNEDQASYMM8_SIGNED
S32S32S32
S16S16S16
F16F16F16
F32F32F32
NEElementwiseDivision
  • All
src0src1dst
F16F16F16
F32F32F32
NEElementwisePower
  • All
src0src1dst
F16F16F16
F32F32F32
NEElementwiseComparison
  • All
src0src1dst
QASYMM8QASYMM8U8
QASYMM8_SIGNEDQASYMM8_SIGNEDU8
S32S32U8
U8U8U8
S16S16U8
F16F16U8
F32F32U8
CLArithmeticAddition
  • All
src0src1dst
QASYMM8QASYMM8QASYMM8
QASYMM8_SIGNEDQASYMM8_SIGNEDQASYMM8_SIGNED
QSYMM16QSYMM16QASYMM16
U8U8U8
U8U8S16
U8S16S16
S16U8S16
S16S16S16
S32S32S32
F16F16F16
F32F32F32
CLArithmeticSubtraction
  • All
src0src1dst
QASYMM8QASYMM8QASYMM8
QASYMM8_SIGNEDQASYMM8_SIGNEDQASYMM8_SIGNED
QSYMM16QSYMM16QASYMM16
U8U8U8
U8U8S16
U8S16S16
S16U8S16
S16S16S16
S32S32S32
F16F16F16
F32F32F32
CLArithmeticDivision
  • All
src0src1dst
F16F16F16
F32F32F32
CLElementwiseMax
  • All
src0src1dst
QASYMM8QASYMM8QASYMM8
QASYMM8_SIGNEDQASYMM8_SIGNEDQASYMM8_SIGNED
QSYMM16QSYMM16QASYMM16
U8U8U8
S16S16S16
S32S32S32
U32U32U32
F16F16F16
F32F32F32
CLElementwiseMin
  • All
src0src1dst
QASYMM8QASYMM8QASYMM8
QASYMM8_SIGNEDQASYMM8_SIGNEDQASYMM8_SIGNED
QSYMM16QSYMM16QASYMM16
U8U8U8
S16S16S16
S32S32S32
U32U32U32
F16F16F16
F32F32F32
CLElementwiseSquaredDiff
  • All
src0src1dst
QASYMM8QASYMM8QASYMM8
QASYMM8_SIGNEDQASYMM8_SIGNEDQASYMM8_SIGNED
QSYMM16QSYMM16QASYMM16
U8U8U8
S16S16S16
F16F16F16
F32F32F32
CLElementwisePower
  • All
src0src1dst
F16F16F16
F32F32F32
ElementwiseUnaryLayer Function to perform element-wise unary operations: Rsqrt, Exp, Neg, Log, Abs, Round and Sin.
  • ANEURALNETWORKS_ABS
  • ANEURALNETWORKS_EXP
  • ANEURALNETWORKS_LOG
  • ANEURALNETWORKS_NEG
  • ANEURALNETWORKS_RSQRT
  • ANEURALNETWORKS_SIN
NEElementwiseUnaryLayer
  • All
srcdst
F16F16
F32F32
S32S32
CLRsqrtLayer
  • All
srcdst
F16F16
F32F32
CLExpLayer
  • All
srcdst
F16F16
F32F32
CLNegLayer
  • All
srcdst
F16F16
F32F32
CLSinLayer
  • All
srcdst
F16F16
F32F32
CLLogLayer
  • All
srcdst
F16F16
F32F32
CLAbsLayer
  • All
srcdst
F16F16
F32F32
CLRoundLayer
  • All
srcdst
F16F16
F32F32
FFT1D Fast Fourier Transform 1D.
  • n/a
NEFFT1D
  • All
srcdst
F32F32
CLFFT1D
  • All
srcdst
F32F32
F16F16
FFT2D Fast Fourier Transform 2D.
  • n/a
NEFFT2D
  • All
srcdst
F32F32
CLFFT2D
  • All
srcdst
F32F32
F16F16
FFTConvolutionLayer Fast Fourier Transform Convolution.
  • ANEURALNETWORKS_CONV_2D
NEFFTConvolutionLayer
  • All
srcdst
F32F32
CLFFTConvolutionLayer
  • All
srcdst
F32F32
F16F16
Fill Function to set the values of a tensor to a given value.
  • ANEURALNETWORKS_FILL
NEFill
  • All
srcdst
AllAll
CLFill
  • All
srcdst
AllAll
FillBorder Function to fill the border of a tensor.
  • n/a
NEFillBorder
  • All
srcdst
AllAll
CLFillBorder
  • All
srcdst
AllAll
FlattenLayer Function to reshape a tensor into a 1D vector.
  • ANEURALNETWORKS_RESHAPE
NEFlattenLayer
  • All
srcdst
AllAll
CLFlattenLayer
  • All
srcdst
AllAll
Floor Function to round each element down to the nearest integer.
  • ANEURALNETWORKS_FLOOR
NEFloor
  • All
srcdst
F32F32
F16F16
CLFloor
  • All
srcdst
F32F32
F16F16
FullyConnectedLayer Function to perform a fully connected / dense layer.
  • ANEURALNETWORKS_FULLY_CONNECTED
NEFullyConnectedLayer
  • NHWC
  • NCHW
src0src1src2dst
F16F16F16F16
F32F32F32F32
QASYMM8QASYMM8S32QASYMM8
QASYMM8_SIGNEDQASYMM8_SIGNEDS32QASYMM8_SIGNED
CLFullyConnectedLayer
  • NHWC
  • NCHW
src0src1src2dst
F16F16F16F16
F32F32F32F32
QASYMM8QASYMM8S32QASYMM8
QASYMM8_SIGNEDQASYMM8_SIGNEDS32QASYMM8_SIGNED
FuseBatchNormalization Function to fuse the batch normalization node to a preceding convolution node.
  • n/a
NEFuseBatchNormalization
  • NHWC
  • NCHW
srcdst
F32F32
F16F16
CLFuseBatchNormalization
  • NHWC
  • NCHW
srcdst
F32F32
F16F16
Gather Performs the Gather operation along the chosen axis.
  • ANEURALNETWORKS_GATHER
NEGather
  • All
srcdst
AllAll
CLGather
  • All
srcdst
AllAll
GEMM General Matrix Multiplication.
  • n/a
NEGEMM
  • All
src0src1src2dst
F32F32F32F32
F16F16F16F16
BFLOAT16BFLOAT16BFLOAT16BFLOAT16
CLGEMM
  • All
src0src1src2dst
F32F32F32F32
F16F16F16F16
GEMMConv2D Function to compute a 2D convolution using GEMM (General Matrix Multiplication).
  • ANEURALNETWORKS_CONV_2D
NEGEMMConv2d
  • All
src0src1src2dst
QASYMM8QASYMM8S32QASYMM8
QASYMM8_SIGNEDQASYMM8_SIGNEDS32QASYMM8_SIGNED
F16F16F16F16
F32F32F32F32
BFLOAT16BFLOAT16BFLOAT16BFLOAT16
GEMMConvolutionLayer Function to compute a convolution layer using GEMM (General Matrix Multiplication).
  • ANEURALNETWORKS_CONV_2D
NEGEMMConvolutionLayer
  • NHWC
  • NCHW
src0src1src2dst
F16F16F16F16
F32F32F32F32
BFLOAT16BFLOAT16BFLOAT16BFLOAT16
QASYMM8QASYMM8S32QASYMM8
QASYMM8QSYMM8_PER_CHANNELS32QASYMM8
QASYMM8_SIGNEDQASYMM8_SIGNEDS32QASYMM8_SIGNED
QASYMM8_SIGNEDQSYMM8_PER_CHANNELS32QASYMM8_SIGNED
CLGEMMConvolutionLayer
  • NHWC
  • NCHW
src0src1src2dst
F16F16F16F16
F32F32F32F32
QASYMM8QASYMM8S32QASYMM8
QASYMM8QSYMM8_PER_CHANNELS32QASYMM8
QASYMM8_SIGNEDQASYMM8_SIGNEDS32QASYMM8_SIGNED
QASYMM8_SIGNEDQSYMM8_PER_CHANNELS32QASYMM8_SIGNED
GEMMDeconvolutionLayer Function to compute a deconvolution layer using GEMM (General Matrix Multiplication).
  • ANEURALNETWORKS_TRANSPOSE_CONV_2D
CLGEMMDeconvolutionLayer
  • NHWC
src0src1src2dst
F16F16F16F16
F32F32F32F32
QASYMM8QASYMM8S32QASYMM8
QASYMM8_SIGNEDQASYMM8_SIGNEDS32QASYMM8_SIGNED
GEMMLowpMatrixMultiplyCore Function to compute a low-precision (quantized) General Matrix Multiplication.
  • n/a
NEGEMMLowpMatrixMultiplyCore
  • NHWC
  • NCHW
src0src1src2dst
QASYMM8QASYMM8S32QASYMM8
QASYMM8QSYMM8_PER_CHANNELS32QASYMM8
QASYMM8QSYMM8S32QASYMM8
QASYMM8QASYMM8S32S32
QASYMM8QSYMM8_PER_CHANNELS32S32
QASYMM8QSYMM8S32S32
QASYMM8_SIGNEDQASYMM8_SIGNEDS32QASYMM8_SIGNED
QASYMM8_SIGNEDQSYMM8_PER_CHANNELS32QASYMM8_SIGNED
QASYMM8_SIGNEDQSYMM8S32QASYMM8_SIGNED
QASYMM8_SIGNEDQASYMM8_SIGNEDS32S32
QASYMM8_SIGNEDQSYMM8_PER_CHANNELS32S32
QASYMM8_SIGNEDQSYMM8S32S32
CLGEMMLowpMatrixMultiplyCore
  • NHWC
  • NCHW
src0src1src2dst
QASYMM8QASYMM8S32QASYMM8
QASYMM8QSYMM8_PER_CHANNELS32QASYMM8
QASYMM8QSYMM8S32QASYMM8
QASYMM8QASYMM8S32S32
QASYMM8QSYMM8_PER_CHANNELS32S32
QASYMM8QSYMM8S32S32
QASYMM8_SIGNEDQASYMM8_SIGNEDS32QASYMM8_SIGNED
QASYMM8_SIGNEDQSYMM8_PER_CHANNELS32QASYMM8_SIGNED
QASYMM8_SIGNEDQSYMM8S32QASYMM8_SIGNED
QASYMM8_SIGNEDQASYMM8_SIGNEDS32S32
QASYMM8_SIGNEDQSYMM8_PER_CHANNELS32S32
QASYMM8_SIGNEDQSYMM8S32S32
GEMMLowpOutputStage Function to apply the output stage of a low-precision General Matrix Multiplication.
  • n/a
NEGEMMLowpOutputStage
  • All
src0src1dst
S32S32QASYMM8
S32S32QASYMM8_SIGNED
S32S32QSYMM16
CLGEMMLowpOutputStage
  • All
src0src1dst
S32S32QASYMM8
S32S32QASYMM8_SIGNED
S32S32QSYMM16
GenerateProposalsLayer Function to generate proposals for a RPN (Region Proposal Network).
  • ANEURALNETWORKS_GENERATE_PROPOSALS
NEGenerateProposalsLayer
  • All
src0src1src2dst
F16F16F16F16
F32F32F32F32
QASYMM8QSYMM8QSYMM16QASYMM8
CLGenerateProposalsLayer
  • All
src0src1src2dst
F16F16F16F16
F32F32F32F32
QASYMM8QSYMM8QSYMM16QASYMM8
InstanceNormalizationLayer Function to perform an instance normalization on a given axis.
  • ANEURALNETWORKS_INSTANCE_NORMALIZATION
NEInstanceNormalizationLayer
  • NHWC
  • NCHW
srcdst
F16F16
F32F32
CLInstanceNormalizationLayer
  • NHWC
  • NCHW
srcdst
F16F16
F32F32
L2NormalizeLayer Function to perform an L2 normalization on a given axis.
  • ANEURALNETWORKS_L2_NORMALIZATION
NEL2NormalizeLayer
  • NHWC
  • NCHW
srcdst
F16F16
F32F32
CLL2NormalizeLayer
  • NHWC
  • NCHW
srcdst
F16F16
F32F32
Logical Function to perform logical operations: AND, OR and NOT.
  • n/a
NELogicalAnd
  • All
src0src1dst
U8U8U8
NELogicalOr
  • All
src0src1dst
U8U8U8
NELogicalNot
  • All
srcdst
U8U8
LogicalAnd Function to perform Logical AND.
  • n/a
CLLogicalAnd
  • All
src0src1dst
U8U8U8
LogicalOr Function to perform Logical OR.
  • n/a
CLLogicalOr
  • All
src0src1dst
U8U8U8
LogicalNot Function to perform Logical NOT.
  • n/a
CLLogicalNot
  • All
srcdst
U8U8
LSTMLayer Function to perform a single time step in a Long Short-Term Memory (LSTM) layer.
  • ANEURALNETWORKS_LSTM
NELSTMLayer
  • All
src0 - src13dst0 - dst3
F16F16
F32F32
CLLSTMLayer
  • All
src0 - src13dst0 - dst3
F16F16
F32F32
LSTMLayerQuantized Function to perform quantized LSTM (Long Short-Term Memory).
  • ANEURALNETWORKS_QUANTIZED_LSTM
  • ANEURALNETWORKS_QUANTIZED_16BIT_LSTM
NELSTMLayerQuantized
  • All
src0 - src8 | src9 - src12 | src13 | src14 | dst0 | dst1
QASYMM8 | S32 | QSYMM16 | QASYMM8 | QSYMM16 | QASYMM8
CLLSTMLayerQuantized
  • All
src0 - src8 | src9 - src12 | src13 | src14 | dst0 | dst1
QASYMM8 | S32 | QSYMM16 | QASYMM8 | QSYMM16 | QASYMM8
MaxUnpoolingLayer Function to perform MaxUnpooling.
  • n/a
NEMaxUnpoolingLayer
  • NHWC
  • NCHW
srcdst
QASYMM8QASYMM8
QASYMM8_SIGNEDQASYMM8_SIGNED
F16F16
F32F32
CLMaxUnpoolingLayer
  • NHWC
  • NCHW
srcdst
QASYMM8QASYMM8
QASYMM8_SIGNEDQASYMM8_SIGNED
F16F16
F32F32
MeanStdDevNormalizationLayer Function to execute mean and standard deviation normalization.
  • n/a
NEMeanStdDevNormalizationLayer
  • NHWC
  • NCHW
srcdst
F32F32
F16F16
CLMeanStdDevNormalizationLayer
  • NHWC
  • NCHW
srcdst
F32F32
F16F16
NormalizationLayer Function to compute a normalization layer.
  • ANEURALNETWORKS_LOCAL_RESPONSE_NORMALIZATION
NENormalizationLayer
  • NHWC
  • NCHW
srcdst
F32F32
F16F16
CLNormalizationLayer
  • NHWC
  • NCHW
srcdst
F32F32
F16F16
PadLayer Function to pad a tensor.
  • ANEURALNETWORKS_PAD
  • ANEURALNETWORKS_PAD_V2
NEPadLayer
  • NHWC
  • NCHW
srcdst
AllAll
CLPadLayer
  • NHWC
  • NCHW
srcdst
AllAll
Permute Function to permute the dimensions of an N-D tensor.
  • ANEURALNETWORKS_TRANSPOSE
NEPermute
  • NHWC
  • NCHW
srcdst
AllAll
CLPermute
  • NHWC
  • NCHW
srcdst
AllAll
PixelWiseMultiplication Function to perform a pixel-wise multiplication.
  • ANEURALNETWORKS_MUL
NEPixelWiseMultiplication
  • All
src0src1dst
QASYMM8QASYMM8QASYMM8
QASYMM8_SIGNEDQASYMM8_SIGNEDQASYMM8_SIGNED
QSYMM16QSYMM16QASYMM16
QSYMM16QSYMM16S32
U8U8U8
U8U8S16
U8S16S16
S16U8S16
S16S16S16
F16F16F16
F32S32F32
CLPixelWiseMultiplication
  • All
src0src1dst
QASYMM8QASYMM8QASYMM8
QASYMM8_SIGNEDQASYMM8_SIGNEDQASYMM8_SIGNED
QSYMM16QSYMM16QASYMM16
QSYMM16QSYMM16S32
U8U8U8
U8U8S16
U8S16S16
S16U8S16
S16S16S16
F16F16F16
F32S32F32
PoolingLayer Function to perform pooling with the specified pooling operation.
  • ANEURALNETWORKS_AVERAGE_POOL_2D
  • ANEURALNETWORKS_L2_POOL_2D
  • ANEURALNETWORKS_MAX_POOL_2D
NEPoolingLayer
  • NHWC
  • NCHW
srcdst
QASYMM8QASYMM8
QASYMM8_SIGNEDQASYMM8_SIGNED
F16F16
F32F32
CLPoolingLayer
  • NHWC
  • NCHW
srcdst
QASYMM8QASYMM8
QASYMM8_SIGNEDQASYMM8_SIGNED
F16F16
F32F32
PReluLayer Function to compute the activation layer with the PRELU activation function.
  • ANEURALNETWORKS_PRELU
NEPReluLayer
  • All
srcdst
QASYMM8QASYMM8
QASYMM8_SIGNEDQASYMM8_SIGNED
F16F16
F32F32
CLPReluLayer
  • All
srcdst
QASYMM8QASYMM8
QASYMM8_SIGNEDQASYMM8_SIGNED
F16F16
F32F32
PriorBoxLayer Function to compute prior boxes and clip.
  • n/a
NEPriorBoxLayer
  • NHWC
  • NCHW
src0src1dst
F32F32F32
CLPriorBoxLayer
  • NHWC
  • NCHW
src0src1dst
F32F32F32
QLSTMLayer Function to perform quantized LSTM (Long Short-Term Memory).
  • ANEURALNETWORKS_QUANTIZED_LSTM
  • ANEURALNETWORKS_QUANTIZED_16BIT_LSTM
NEQLSTMLayer
  • All
src0 | src1 - src6 | src7 - src9 | src10 | src11 | dst0 | dst1 - dst2
QASYMM8_SIGNED | QASYMM8 | S32 | QSYMM16 | QASYMM8_SIGNED | QSYMM16 | QASYMM8_SIGNED
CLQLSTMLayer
  • All
src0 | src1 - src6 | src7 - src9 | src10 | src11 | dst0 | dst1 - dst2
QASYMM8_SIGNED | QASYMM8 | S32 | QSYMM16 | QASYMM8_SIGNED | QSYMM16 | QASYMM8_SIGNED
QuantizationLayer Function to perform a quantization layer.
  • ANEURALNETWORKS_QUANTIZE
NEQuantizationLayer
  • All
srcdst
QASYMM8QASYMM8, QASYMM8_SIGNED, QASYMM16
QASYMM8_SIGNEDQASYMM8, QASYMM8_SIGNED, QASYMM16
F16QASYMM8, QASYMM8_SIGNED, QASYMM16
F32QASYMM8, QASYMM8_SIGNED, QASYMM16
CLQuantizationLayer
  • All
srcdst
QASYMM8QASYMM8, QASYMM8_SIGNED, QASYMM16
QASYMM8_SIGNEDQASYMM8, QASYMM8_SIGNED, QASYMM16
F16QASYMM8, QASYMM8_SIGNED, QASYMM16
F32QASYMM8, QASYMM8_SIGNED, QASYMM16
Range Function to generate a sequence of numbers starting from START and extending by increments of STEP up to but not including END.
  • n/a
NERange
  • All
dst
U8
S8
U16
S16
U32
S32
F16
F32
CLRange
  • All
dst
U8
S8
QASYMM8
U16
S16
U32
S32
F16
F32
ReduceMean Function to perform a reduce-mean operation.
  • ANEURALNETWORKS_MEAN
NEReduceMean
  • All
srcdst
QASYMM8QASYMM8
QASYMM8_SIGNEDQASYMM8_SIGNED
F16F16
F32F32
CLReduceMean
  • All
srcdst
QASYMM8QASYMM8
QASYMM8_SIGNEDQASYMM8_SIGNED
F16F16
F32F32
ReductionOperation Function to perform a reduction with one of the following operations: ARG_IDX_MAX (index of the max value), ARG_IDX_MIN (index of the min value), MEAN_SUM (mean of sum), PROD (product), SUM_SQUARE (sum of squares), SUM (sum), MIN (min), MAX (max).
  • ANEURALNETWORKS_REDUCE_ALL
  • ANEURALNETWORKS_REDUCE_ANY
  • ANEURALNETWORKS_REDUCE_MAX
  • ANEURALNETWORKS_REDUCE_MIN
  • ANEURALNETWORKS_REDUCE_PROD
  • ANEURALNETWORKS_REDUCE_SUM
NEReductionOperation
  • All
srcdst
QASYMM8QASYMM8
QASYMM8_SIGNEDQASYMM8_SIGNED
F16F16
F32F32
S32S32
CLReductionOperation
  • All
srcdst
QASYMM8QASYMM8
QASYMM8_SIGNEDQASYMM8_SIGNED
F16F16
F32F32
S32S32
ReorgLayer Function to perform a reorganization layer, rearranging the input tensor into the output tensor.
  • n/a
NEReorgLayer
  • NHWC
  • NCHW
srcdst
AllAll
CLReorgLayer
  • NHWC
  • NCHW
srcdst
AllAll
ReshapeLayer Function to reshape a tensor.
  • ANEURALNETWORKS_RESHAPE
  • ANEURALNETWORKS_SQUEEZE
NEReshapeLayer
  • All
srcdst
AllAll
CLReshapeLayer
  • All
srcdst
AllAll
Reverse Function to reverse a tensor along a given axis.
  • n/a
NEReverse
  • All
src0src1dst
AllU32All
CLReverse
  • All
src0src1dst
AllU32All
RNNLayer Function to perform a recurrent neural network (RNN) layer.
  • ANEURALNETWORKS_RNN
NERNNLayer
  • NHWC
  • NCHW
src0src1src2src3dst0dst1
F16F16F16F16F16F16
F32F32F32F32F32F32
CLRNNLayer
  • NHWC
  • NCHW
src0src1src2src3dst0dst1
F16F16F16F16F16F16
F32F32F32F32F32F32
ROIAlignLayer Function to perform ROI alignment.
  • ANEURALNETWORKS_ROI_ALIGN
NEROIAlignLayer
  • All
src0src1dst
F16F16F16
F32F32F32
QASYMM8QASYMM16QASYMM8
QASYMM8_SIGNEDQASYMM16QASYMM8_SIGNED
CLROIAlignLayer
  • All
src0src1dst
F16F16F16
F32F32F32
QASYMM8QASYMM16QASYMM8
QASYMM8_SIGNEDQASYMM16QASYMM8_SIGNED
ROIPoolingLayer Function to perform ROI pooling.
  • ANEURALNETWORKS_ROI_POOLING
NEROIPoolingLayer
  • All
src0src1dst
F32U16F32
QASYMM8U16QASYMM8
CLROIPoolingLayer
  • All
src0src1dst
F16U16F16
F32U16F32
QASYMM8U16QASYMM8
Scale Function to resize a tensor using one of the following interpolation methods: bilinear or nearest neighbour.
  • ANEURALNETWORKS_RESIZE_BILINEAR
  • ANEURALNETWORKS_RESIZE_NEAREST_NEIGHBOR
NEScale
  • NHWC
  • NCHW
srcdst
QASYMM8QASYMM8
QASYMM8_SIGNEDQASYMM8_SIGNED
F16F16
F32F32
U8U8
S16S16
CLScale
  • NHWC
  • NCHW
srcdst
QASYMM8QASYMM8
QASYMM8_SIGNEDQASYMM8_SIGNED
F16F16
F32F32
U8U8
S16S16
Select Function to select values from 2 tensors depending on an input tensor of booleans.
  • ANEURALNETWORKS_SELECT
NESelect
  • All
src0src1src2dst
U8AllAllAll
CLSelect
  • All
src0src1src2dst
U8AllAllAll
Slice Function to perform tensor slicing.
  • ANEURALNETWORKS_SLICE
NESlice
  • All
srcdst
AllAll
CLSlice
  • All
srcdst
AllAll
SoftmaxLayer Function to compute a Softmax or Log Softmax layer.
  • ANEURALNETWORKS_LOG_SOFTMAX
  • ANEURALNETWORKS_SOFTMAX
NESoftmaxLayerGeneric
  • All
srcdst
QASYMM8QASYMM8
QASYMM8_SIGNEDQASYMM8_SIGNED
F16F16
F32F32
CLSoftmaxLayerGeneric
  • All
srcdst
QASYMM8QASYMM8
QASYMM8_SIGNEDQASYMM8_SIGNED
F16F16
F32F32
SpaceToBatchLayer Function to rearrange blocks of spatial data into the batch dimension.
  • ANEURALNETWORKS_SPACE_TO_BATCH_ND
NESpaceToBatchLayer
  • NHWC
  • NCHW
src0src1src2dst
AllS32S32All
CLSpaceToBatchLayer
  • NHWC
  • NCHW
src0src1src2dst
AllS32S32All
SpaceToDepthLayer Function to rearrange blocks of spatial data into depth.
  • ANEURALNETWORKS_SPACE_TO_DEPTH
NESpaceToDepthLayer
  • NHWC
  • NCHW
srcdst
AllAll
CLSpaceToDepthLayer
  • NHWC
  • NCHW
srcdst
AllAll
Split Function to split a tensor along a given axis.
  • ANEURALNETWORKS_SPLIT
NESplit
  • All
srcdst
AllAll
CLSplit
  • All
srcdst
AllAll
StackLayer Function to stack tensors along an axis.
  • n/a
NEStackLayer
  • All
srcdst
AllAll
CLStackLayer
  • All
srcdst
AllAll
StridedSlice Function to extract a strided slice of a tensor.
  • ANEURALNETWORKS_STRIDED_SLICE
NEStridedSlice
  • All
srcdst
AllAll
CLStridedSlice
  • All
srcdst
AllAll
Tile Function to construct a tensor by tiling a given tensor.
  • ANEURALNETWORKS_TILE
NETile
  • All
srcdst
AllAll
CLTile
  • All
srcdst
AllAll
Transpose Function to transpose a 2D tensor.
  • ANEURALNETWORKS_TRANSPOSE
NETranspose
  • All
srcdst
AllAll
CLTranspose
  • All
srcdst
AllAll
Unstack Function to unpack a rank-R tensor into rank-(R-1) tensors.
  • n/a
NEUnstack
  • All
srcdst
AllAll
CLUnstack
  • All
srcdst
AllAll
WinogradConvolutionLayer Function to compute a convolution using the Winograd algorithm.
  • ANEURALNETWORKS_CONV_2D
NEWinogradConvolutionLayer
  • NHWC
  • NCHW
src0src1src2dst
F16F16F16F16
F32F32F32F32
CLWinogradConvolutionLayer
  • NHWC
  • NCHW
src0src1src2dst
F16F16F16F16
F32F32F32F32
WinogradInputTransform Function to perform the Winograd input transform.
  • n/a
CLWinogradInputTransform
  • NHWC
  • NCHW
srcdst
F16F16
F32F32
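
As referenced above the table, the equivalent sketch for the OpenCL backend follows (again with an illustrative shape; the structural differences are the CLScheduler initialisation, the CLTensor type and the final sync):

@code{.cpp}
#include "arm_compute/runtime/CL/CLFunctions.h"
#include "arm_compute/runtime/CL/CLScheduler.h"
#include "arm_compute/runtime/CL/CLTensor.h"

using namespace arm_compute;

int main()
{
    // Create the default OpenCL context and command queue
    CLScheduler::get().default_init();

    CLTensor src, dst;
    src.allocator()->init(TensorInfo(TensorShape(27U, 11U, 3U), 1, DataType::F32));
    dst.allocator()->init(TensorInfo(TensorShape(27U, 11U, 3U), 1, DataType::F32));

    // Configure the OpenCL backend of ActivationLayer
    CLActivationLayer act;
    act.configure(&src, &dst, ActivationLayerInfo(ActivationLayerInfo::ActivationFunction::RELU));

    src.allocator()->allocate();
    dst.allocator()->allocate();
    act.run();

    // Wait for the enqueued OpenCL work to finish before reading dst
    CLScheduler::get().sync();

    return 0;
}
@endcode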
*/
} // namespace arm_compute