path: root/src
| Age | Commit message | Author |
| --- | --- | --- |
| 37 hours | Improve CPU extension detection on macos (HEAD, release_candidate, main) | Viet-Hoa Do |
| 43 hours | ScatterND fix for scalar cases | Gunes Bayir |
| 4 days | Make quantization rounding consistent | Jonathan Deakin |
| 4 days | Add SME2 implementation of Softmax for QASYMM8 and QASYMM8_SIGNED. (branches/arm_compute_24_05) | Omar Al Khatib |
| 4 days | Add batched indices support to Scatter GPU Implementation | Mohammed Suhail Munshi |
| 9 days | arm_gemm: fix SVE check on fast mode kernels. | David Mansell |
| 10 days | Change reorder implementation to be vector length agnostic for OHWIo8 reorder | Radu Salavat |
| 11 days | New SME2 heuristics. | David Mansell |
| 12 days | Add fp16 and integer data type support for ScatterNd in Gpu | Gunes Bayir |
| 13 days | Disable SME2 Gemmlowp s8f32 kernel selection in case results needs to be accu... | Gunes Bayir |
| 2024-04-26 | Disable SME2 Gemm kernel selection in case results needs to be accumulated | Gunes Bayir |
| 2024-04-25 | Add update/index/output (m+1)/2d/(m+n) support for CLScatter | Gunes Bayir |
| 2024-04-25 | Add padding to the shift and multipliers buffers | Pablo Marquez Tello |
| 2024-04-22 | Scatter GPU Kernel Implementation for 1D tensors. | Mohammed Suhail Munshi |
| 2024-04-16 | fix compilation errors on linux with gcc12 | Sunita Nadampalli |
| 2024-04-15 | Add s8f32 kernels and dynamic QuantizationInfo | Jonathan Deakin |
| 2024-04-12 | Accumulation in Cpu Gemm kernels is not supported for quantized kernels in aa... | Radu Salavat |
| 2024-04-11 | Add SME2 implementation of softmax for FP16 | Gunes Bayir |
| 2024-04-11 | Add in place summation to CPU GEMM kernels | Radu Salavat |
| 2024-04-05 | Fix compiler error | Pablo Marquez Tello |
| 2024-04-04 | Parallelise im2col along dimensions with higher number of iterations | Milos Puzovic |
| 2024-04-02 | Add SME2 implementation of softmax for FP32 | Viet-Hoa Do |
| 2024-03-27 | Added new NEON fixed format fast math mode hybrid kernel with maximum height ... | Milos Puzovic |
| 2024-03-25 | Adds Tests and reference implementation for scatter operator with 1D tensors. | Mohammed Suhail Munshi |
| 2024-03-21 | Add skeleton for CLScatter op, reference and tests | Mohammed Suhail Munshi |
| 2024-03-21 | [ONCPUML-1451] Add matmul kernel to enable bf16 to bf16 operations via PyTorc... | Renato Arantes |
| 2024-03-20 | Make Cpu/Gpu/Ref scalar/vectoral S32 division consistent | Gunes Bayir |
| 2024-03-19 | Fix overflow in NEMeanStdDevNormalizationKernel | Pablo Marquez Tello |
| 2024-03-18 | Fix quant. gemv kernel driver by adding set_quantized_bias() | Gunes Bayir |
| 2024-03-14 | arm_gemm: Fix bias handling for sme2 FP16 GEMV. | David Mansell |
| 2024-03-14 | Fix validation in pool2d assembly wrapper | Pablo Marquez Tello |
| 2024-03-12 | Optimize CpuSoftmaxKernel for axis != 0 and neon kernels | Omar Al Khatib |
| 2024-03-12 | Fix WoA nightly failure | Pablo Marquez Tello |
| 2024-03-11 | Prefer indirect Gemm vs. Direct convolution if supported | Gunes Bayir |
| 2024-03-04 | Disable FP16 on 32 bit | Pablo Marquez Tello |
| 2024-03-04 | Fix performance regression in fixed-format kernels | Gunes Bayir |
| 2024-03-01 | Set Neon™ as present for WoA | Pablo Marquez Tello |
| 2024-02-22 | Fix segfault in DWC in WoA | Pablo Marquez Tello |
| 2024-02-22 | Fix OpenBSD® build failure caused by patch 11144 | Gunes Bayir |
| 2024-02-21 | Integrate new pretranspose_b_array with extra fused transpose of B | Gunes Bayir |
| 2024-02-20 | Requantization cases for offset changes only | Mohammed Suhail Munshi |
| 2024-02-14 | Fix compiler errors in cl-clang | Pablo Marquez Tello |
| 2024-02-12 | Fix parallel depthwise perf regression from 2db938c | Jonathan Deakin |
| 2024-02-09 | Add support for QSYMM8 in ClCastKernel | Pablo Marquez Tello |
| 2024-02-09 | Remove CKW prototype and Template Writer | Gunes Bayir |
| 2024-02-08 | Fix the bug in GpuTanh operator in dynamic fusion | Gunes Bayir |
| 2024-02-08 | Mark GpuSoftmax and GpuReshape as not supported | Gunes Bayir |
| 2024-02-07 | Parallelize CPU depthwise over batch if only 1 row | Jonathan Deakin |
| 2024-02-06 | arm_gemm: SME: Remove artificial single-thread constraint on quantized int8 k... | David Mansell |
| 2024-02-05 | Fix leftover cols in CpuGemmLowpMatrixBReductionKernel | Jonathan Deakin |