path: root/src/cpu
Age | Commit message | Author
12 days | Use lookup table for Fp16 Tanh activation in hardware with SVE | Gunes Bayir
2024-05-17 | Fix linking error to fp16_run_dequantization_core() | Ramy Elgammal
2024-05-16 | Refactor Dequantize to enable FP16 kernel in v8a multi_isa builds | Ramy Elgammal
2024-05-15 | Fix nightly build error | Pablo Marquez Tello
2024-05-14 | Rework CpuQuantizeKernel to enable FP16 in multi_isa builds | Ramy Elgammal
2024-05-14 | Refactor arm_gemm to enable FP16 in all multi_isa builds | Pablo Marquez Tello
2024-05-13 | Fix ReductionLayer FP16 for armv8a multi_isa builds | Ramy Elgammal
2024-05-08 | Add SME2 implementation of Softmax for QASYMM8 and QASYMM8_SIGNED. | Omar Al Khatib
2024-04-15 | Add s8f32 kernels and dynamic QuantizationInfo | Jonathan Deakin
2024-04-12 | Accumulation in Cpu Gemm kernels is not supported for quantized kernels in aa... | Radu Salavat
2024-04-11 | Add SME2 implementation of softmax for FP16 | Gunes Bayir
2024-04-11 | Add in place summation to CPU GEMM kernels | Radu Salavat
2024-04-04 | Parallelise im2col along dimensions with higher number of iterations | Milos Puzovic
2024-04-02 | Add SME2 implementation of softmax for FP32 | Viet-Hoa Do
2024-03-21 | [ONCPUML-1451] Add matmul kernel to enable bf16 to bf16 operations via PyTorc... | Renato Arantes
2024-03-20 | Make Cpu/Gpu/Ref scalar/vectoral S32 division consistent | Gunes Bayir
2024-03-19 | Fix overflow in NEMeanStdDevNormalizationKernel | Pablo Marquez Tello
2024-03-14 | Fix validation in pool2d assembly wrapper | Pablo Marquez Tello
2024-03-12 | Optimize CpuSoftmaxKernel for axis != 0 and neon kernels | Omar Al Khatib
2024-03-11 | Prefer indirect Gemm vs. Direct convolution if supported | Gunes Bayir
2024-03-04 | Fix performance regression in fixed-format kernels | Gunes Bayir
2024-02-21 | Integrate new pretranspose_b_array with extra fused transpose of B | Gunes Bayir
2024-02-20 | Requantization cases for offset changes only | Mohammed Suhail Munshi
2024-02-12 | Fix parallel depthwise perf regression from 2db938c | Jonathan Deakin
2024-02-07 | Parallelize CPU depthwise over batch if only 1 row | Jonathan Deakin
2024-02-05 | Fix leftover cols in CpuGemmLowpMatrixBReductionKernel | Jonathan Deakin
2024-01-23 | Fix for Logically dead code detected in Coverity checks | Anitha Raj
2024-01-10 | Use look up table for fp16 activation | Mohammed Suhail Munshi
2024-01-04 | Prevent RELU from being processed thru LUT in INT8 | Sangwon Ha
2023-12-12 | Winograd changes to enable fp16 in armv8a multi_isa builds | Pablo Marquez Tello
2023-12-07 | Optimize CPU depth-to-space | Viet-Hoa Do
2023-12-06 | Revert "thread_local _custom_scheduler" | Pablo Marquez Tello
2023-12-05 | Optimize CpuSoftmaxKernel for axis=0 | Gunes Bayir
2023-11-27 | BatchNorm changes to enable fp16 in armv8a multi_isa builds | Pablo Marquez Tello
2023-11-27 | CpuMul changes to enable fp16 in armv8a multi_isa builds | Pablo Marquez Tello
2023-11-24 | thread_local _custom_scheduler | David Svantesson
2023-11-16 | NormalizationLayer changes to enable fp16 in armv8a multi_isa builds | Pablo Marquez Tello
2023-11-15 | Fix various coverity issues | SiCong Li
2023-11-10 | Fix CpuGemmConv2d int8 segfault | SiCong Li
2023-11-09 | Pooling changes to enable fp16 in armv8a multi_isa builds | Pablo Marquez Tello
2023-11-09 | DepthwiseConvolution changes to enable fp16 in armv8a multi_isa builds | Pablo Marquez Tello
2023-11-08 | Optimize CpuGemmConv2d start-up time | SiCong Li
2023-10-30 | DirectConv and Im2Col changes to enable fp16 in armv8a multi_isa builds | Pablo Marquez Tello
2023-10-20 | FuseBatchNorm changes to enable fp16 in armv8a multi_isa builds | Pablo Marquez Tello
2023-10-13 | Fix build error in CpuScale | Pablo Marquez Tello
2023-10-12 | Scale changes to enable fp16 in armv8a multi_isa builds | Pablo Marquez Tello
2023-10-10 | Fix build error | Pablo Marquez Tello
2023-10-10 | CpuSubKernel changes to enable fp16 in armv8a multi_isa builds | Pablo Marquez Tello
2023-10-09 | Pool2d changes to enable fp16 in armv8a multi_isa builds | Pablo Marquez Tello
2023-10-05 | Optimize CLTranspose operator | Jakub Sujak