|
OpenCL implementation uses built in erf.
NEON implementation requires new vectorized erf.
Uses the following approximation:
erf(x) = 1 - 1 / (1 + a1x + a2x^2 + a3x^3 + a4x^4)^4
a1 = 0.278393, a2 = 0.230389, a3 = 0.000972, a4 = 0.078108
From https://en.wikipedia.org/wiki/Error_function#Numerical_approximations
Signed-off-by: Murray Kornelsen <murray.kornelsen@mail.mcgill.ca>
Change-Id: I2d3964b2c26a4334166b17135f9104bc6324fad2
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7921
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Pablo Marquez Tello <pablo.tello@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|