path: root/src/cpu
Age | Commit message | Author
2024-04-15 | Add s8f32 kernels and dynamic QuantizationInfo | Jonathan Deakin
2024-04-12 | Accumulation in Cpu Gemm kernels is not supported for quantized kernels in aa... | Radu Salavat
2024-04-11 | Add SME2 implementation of softmax for FP16 | Gunes Bayir
2024-04-11 | Add in place summation to CPU GEMM kernels | Radu Salavat
2024-04-04 | Parallelise im2col along dimensions with higher number of iterations | Milos Puzovic
2024-04-02 | Add SME2 implementation of softmax for FP32 | Viet-Hoa Do
2024-03-21 | [ONCPUML-1451] Add matmul kernel to enable bf16 to bf16 operations via PyTorc... | Renato Arantes
2024-03-20 | Make Cpu/Gpu/Ref scalar/vectoral S32 division consistent | Gunes Bayir
2024-03-19 | Fix overflow in NEMeanStdDevNormalizationKernel | Pablo Marquez Tello
2024-03-14 | Fix validation in pool2d assembly wrapper | Pablo Marquez Tello
2024-03-12 | Optimize CpuSoftmaxKernel for axis != 0 and neon kernels | Omar Al Khatib
2024-03-11 | Prefer indirect Gemm vs. Direct convolution if supported | Gunes Bayir
2024-03-04 | Fix performance regression in fixed-format kernels | Gunes Bayir
2024-02-21 | Integrate new pretranspose_b_array with extra fused transpose of B | Gunes Bayir
2024-02-20 | Requantization cases for offset changes only | Mohammed Suhail Munshi
2024-02-12 | Fix parallel depthwise perf regression from 2db938c | Jonathan Deakin
2024-02-07 | Parallelize CPU depthwise over batch if only 1 row | Jonathan Deakin
2024-02-05 | Fix leftover cols in CpuGemmLowpMatrixBReductionKernel | Jonathan Deakin
2024-01-23 | Fix for Logically dead code detected in Coverity checks | Anitha Raj
2024-01-10 | Use look up table for fp16 activation | Mohammed Suhail Munshi
2024-01-04 | Prevent RELU from being processed thru LUT in INT8 | Sangwon Ha
2023-12-12 | Winograd changes to enable fp16 in armv8a multi_isa builds | Pablo Marquez Tello
2023-12-07 | Optimize CPU depth-to-space | Viet-Hoa Do
2023-12-06 | Revert "thread_local _custom_scheduler" | Pablo Marquez Tello
2023-12-05 | Optimize CpuSoftmaxKernel for axis=0 | Gunes Bayir
2023-11-27 | BatchNorm changes to enable fp16 in armv8a multi_isa builds | Pablo Marquez Tello
2023-11-27 | CpuMul changes to enable fp16 in armv8a multi_isa builds | Pablo Marquez Tello
2023-11-24 | thread_local _custom_scheduler | David Svantesson
2023-11-16 | NormalizationLayer changes to enable fp16 in armv8a multi_isa builds | Pablo Marquez Tello
2023-11-15 | Fix various coverity issues | SiCong Li
2023-11-10 | Fix CpuGemmConv2d int8 segfault | SiCong Li
2023-11-09 | Pooling changes to enable fp16 in armv8a multi_isa builds | Pablo Marquez Tello
2023-11-09 | DepthwiseConvolution changes to enable fp16 in armv8a multi_isa builds | Pablo Marquez Tello
2023-11-08 | Optimize CpuGemmConv2d start-up time | SiCong Li
2023-10-30 | DirectConv and Im2Col changes to enable fp16 in armv8a multi_isa builds | Pablo Marquez Tello
2023-10-20 | FuseBatchNorm changes to enable fp16 in armv8a multi_isa builds | Pablo Marquez Tello
2023-10-13 | Fix build error in CpuScale | Pablo Marquez Tello
2023-10-12 | Scale changes to enable fp16 in armv8a multi_isa builds | Pablo Marquez Tello
2023-10-10 | Fix build error | Pablo Marquez Tello
2023-10-10 | CpuSubKernel changes to enable fp16 in armv8a multi_isa builds | Pablo Marquez Tello
2023-10-09 | Pool2d changes to enable fp16 in armv8a multi_isa builds | Pablo Marquez Tello
2023-10-05 | Optimize CLTranspose operator | Jakub Sujak
2023-10-02 | Optimize CL and Neon Winograd tests | Gunes Bayir
2023-09-28 | Apply clang-format on repository | Felix Thomasmathibalan
2023-09-26 | Re-arrange header inclusion order | Felix Thomasmathibalan
2023-09-26 | Select changes to enable fp16 in armv8a multi_isa builds | Pablo Marquez Tello
2023-09-26 | Maxunpooling changes to enable fp16 in armv8a multi_isa builds | Pablo Marquez Tello
2023-09-21 | L2Norm changes to enable fp16 in armv8a multi_isa builds | Pablo Marquez Tello
2023-09-21 | Gemm changes to enable fp16 in armv8a multi_isa builds | Pablo Marquez Tello
2023-09-20 | Fix the validation issue in AddMulAdd fused kernel | Gunes Bayir