path: root/src
Age | Commit message | Author
2021-08-24 | Remove map/unmap overhead for input/output accessor when using DummyAccessor | Giorgio Arena
2021-08-24 | Re-use auxiliary memory within CpuWinogradConv2d operators | Georgios Pinitas
2021-08-23 | Remove padding from ClScaleKernel | Giorgio Arena
2021-08-20 | Rename [Cl|Cpu]GemmConvolution to [Cl|Gpu]GemmConv2d | Georgios Pinitas
2021-08-19 | Address comments on avoiding releasing weights if used by multiple functions | Giorgio Arena
2021-08-18 | Enable fast_math on CpuGemmConvolution | Georgios Pinitas
2021-08-18 | Update the heuristic to call direct convolution in clConv2d | Gian Marco Iodice
2021-08-18 | Retain weights in ClGemm when reconfiguring the operator with retention | Georgios Pinitas
2021-08-13 | Avoid releasing weights if they are used by multiple functions | Georgios Pinitas
2021-08-13 | Ensure correct transformed matrices are used in CpuGemmConvolution | Georgios Pinitas
2021-08-12 | Ensure that correct transformed matrices are used in CpuFullyConnected | Georgios Pinitas
2021-08-11 | Fix performance regression due to clFinish() | Gian Marco Iodice
2021-08-10 | Fix compiler error in CLActivationLayer | Pablo Marquez Tello
2021-08-06 | Fix compiler error in GCC 7.4 + Ubuntu 16 | Pablo Marquez Tello
2021-08-04 | Remove 21.08 deprecated code | Freddie Liardet
2021-08-04 | Report error for unsupported non-constant weights in CpuFullyConnected | Michele Di Giorgio
2021-08-04 | Fix depthwise convolution assembly kernels | Freddie Liardet
2021-08-04 | Avoid over-allocation of temporary buffers within CpuWinogradConv2d | Georgios Pinitas
2021-08-04 | Implement Operator API | Georgios Pinitas
2021-08-02 | Add missing limits include | Freddie Liardet
2021-08-02 | Benchmark and set default LWS for GEMM, Direct convolution and Winograd | Giorgio Arena
2021-08-02 | Port CLConvolutionLayer | Sheri Zhang
2021-07-30 | Port ClFullyConnected to new API | Georgios Pinitas
2021-07-30 | Reintroduce implementation of NEConvolutionLayer::get_convolution_method | Michele Di Giorgio
2021-07-30 | Compilation issue: neon=1 armv8.2 on Android with NDKr18beta1 | Gian Marco Iodice
2021-07-29 | Fix A55 performance constant for fp16 hybrid gemm kernel | Georgios Pinitas
2021-07-29 | Port NEConvolutionLayer | Michalis Spyrou
2021-07-28 | Create custom flags for enabling fp16 support | Georgios Pinitas
2021-07-28 | Reduce binary footprint of CpuConvertFullyConnectedWeightsKernel | Michele Di Giorgio
2021-07-28 | Fix bare metal build issues | Freddie Liardet
2021-07-28 | Fix cpu GEMM fp16 issue | Freddie Liardet
2021-07-28 | Reorganize the kernels into nhwc, nchw and common folders | Adnan AlSinan
2021-07-28 | Remove generated kernels that overlap hand-written ones | Georgios Pinitas
2021-07-27 | Fix memory lifetime issue | Georgios Pinitas
2021-07-27 | Port CLGEMMConvolutionLayer | Manuel Bottini
2021-07-27 | Dispatch Conv2d using the Direct method when necessary | Georgios Pinitas
2021-07-27 | Update GEMM assembly performance parameters | Georgios Pinitas
2021-07-26 | Add missing limits include | Freddie Liardet
2021-07-26 | Fix allocation of prepare tensor on ClWinogradConv2d | Georgios Pinitas
2021-07-25 | Reorganize the kernels into nhwc, nchw and common folders | Adnan AlSinan
2021-07-23 | Avoid allocation of auxiliary memory in CpuGemmConvolution | Georgios Pinitas
2021-07-23 | Fix vector_length identification mechanism for SVE | Georgios Pinitas
2021-07-23 | Port NEFullyConnectedLayer to memory injecting interface | Michele Di Giorgio
2021-07-23 | Pass fast math flag for correct GEMM3D validation support | Georgios Pinitas
2021-07-23 | Fix bare metal build error | Freddie Liardet
2021-07-22 | Expose fast_math mode for GEMM through BFloat16 | Georgios Pinitas
2021-07-22 | Inject temporary tensors to pack if they don't exist in CpuSoftmax | Georgios Pinitas
2021-07-22 | Port ClGemmLowp to new API | Georgios Pinitas
2021-07-22 | Fix oclgrind int overflow warning | Freddie Liardet
2021-07-22 | Update GEMM assembly kernels | Georgios Pinitas