path: root/src
Age | Commit message | Author
2024-03-27 | Added new NEON fixed format fast math mode hybrid kernel with maximum height ... | Milos Puzovic
2024-03-25 | Adds Tests and reference implementation for scatter operator with 1D tensors. | Mohammed Suhail Munshi
2024-03-21 | Add skeleton for CLScatter op, reference and tests | Mohammed Suhail Munshi
2024-03-21 | [ONCPUML-1451] Add matmul kernel to enable bf16 to bf16 operations via PyTorc... | Renato Arantes
2024-03-20 | Make Cpu/Gpu/Ref scalar/vectoral S32 division consistent | Gunes Bayir
2024-03-19 | Fix overflow in NEMeanStdDevNormalizationKernel | Pablo Marquez Tello
2024-03-18 | Fix quant. gemv kernel driver by adding set_quantized_bias() | Gunes Bayir
2024-03-14 | arm_gemm: Fix bias handling for sme2 FP16 GEMV. | David Mansell
2024-03-14 | Fix validation in pool2d assembly wrapper | Pablo Marquez Tello
2024-03-12 | Optimize CpuSoftmaxKernel for axis != 0 and neon kernels | Omar Al Khatib
2024-03-12 | Fix WoA nightly failure | Pablo Marquez Tello
2024-03-11 | Prefer indirect Gemm vs. Direct convolution if supported | Gunes Bayir
2024-03-04 | Disable FP16 on 32 bit | Pablo Marquez Tello
2024-03-04 | Fix performance regression in fixed-format kernels | Gunes Bayir
2024-03-01 | Set Neon™ as present for WoA | Pablo Marquez Tello
2024-02-22 | Fix segfault in DWC in WoA | Pablo Marquez Tello
2024-02-22 | Fix OpenBSD® build failure caused by patch 11144 | Gunes Bayir
2024-02-21 | Integrate new pretranspose_b_array with extra fused transpose of B | Gunes Bayir
2024-02-20 | Requantization cases for offset changes only | Mohammed Suhail Munshi
2024-02-14 | Fix compiler errors in cl-clang | Pablo Marquez Tello
2024-02-12 | Fix parallel depthwise perf regression from 2db938c | Jonathan Deakin
2024-02-09 | Add support for QSYMM8 in ClCastKernel | Pablo Marquez Tello
2024-02-09 | Remove CKW prototype and Template Writer | Gunes Bayir
2024-02-08 | Fix the bug in GpuTanh operator in dynamic fusion | Gunes Bayir
2024-02-08 | Mark GpuSoftmax and GpuReshape as not supported | Gunes Bayir
2024-02-07 | Parallelize CPU depthwise over batch if only 1 row | Jonathan Deakin
2024-02-06 | arm_gemm: SME: Remove artificial single-thread constraint on quantized int8 k... | David Mansell
2024-02-05 | Fix leftover cols in CpuGemmLowpMatrixBReductionKernel | Jonathan Deakin
2024-02-01 | Use the stable CKW API in the GPU dynamic fusion backend | Gunes Bayir
2024-01-25 | arm_gemm: convolution: optimize convolver.hpp. | David Mansell
2024-01-23 | Fix for Logically dead code detected in Coverity checks | Anitha Raj
2024-01-23 | Fix for unchecked return value detected in Coverity checks. | Anitha Raj
2024-01-23 | Make GpuWorkloadContext own all tensor info objects | Viet-Hoa Do
2024-01-18 | Fix divide-by-zero compilation error | Viet-Hoa Do
2024-01-17 | Fix minor issue, clean lut code | Mohammed Suhail Munshi
2024-01-12 | Fix potential threading issue in LUTManager | Mohammed Suhail Munshi
2024-01-12 | [ONCPUML-1387] Add ACL based reorder for f32 to bf16 data type conversion. | Renato Arantes
2024-01-10 | Fix compilation error on GCC 13.2 | Jakub Sujak
2024-01-10 | Use look up table for fp16 activation | Mohammed Suhail Munshi
2024-01-04 | Prevent RELU from being processed thru LUT in INT8 | Sangwon Ha
2023-12-22 | Fix nightly issue caused by gemm_reshaped_only_rhs_mmul kernel | Gunes Bayir
2023-12-22 | Add Mali™-G720 and Mali™-G620 as GpuTargets | Gunes Bayir
2023-12-15 | Fix nightly bug caused by not validation 3d cases for input tensor | Gunes Bayir
2023-12-15 | Revert "Fix nightly bug caused by wrong validation in Gemm mmul kernel" | Gunes Bayir
2023-12-14 | Fix validation error in CL generate proposals kernel | Gunes Bayir
2023-12-13 | Fix nightly bug caused by wrong validation in Gemm mmul kernel | Gunes Bayir
2023-12-12 | Winograd changes to enable fp16 in armv8a multi_isa builds | Pablo Marquez Tello
2023-12-08 | Fix validation error in graph_ssd_mobilenet | Gunes Bayir
2023-12-08 | Fix unit tests failing in CL/UNIT/TensorAllocator | Gunes Bayir
2023-12-07 | Optimize CPU depth-to-space | Viet-Hoa Do