author Nathan John Sircombe <nathan.sircombe@arm.com> 2023-04-26 15:02:43 +0100
committer nathan.sircombe <nathan.sircombe@arm.com> 2023-05-02 14:24:17 +0000
commit d7113e4af5b5497d3a3a62dc9cf6b147e2a024cd
tree 699742317f9befb3adf8be4222e13fa6cdd46f6b /docs/user_guide
parent 7a0f1bdaf74cde263b2919c7d1652b0cb87a94f3
Removes `experimental` from `experimental_fixed_format_kernels` flag
Renames `experimental_fixed_format_kernels` build option to `fixed_format_kernels`.
Adds documentation for the flag covering basics:
- What fixed-format kernels are
- Why they're needed
- Which backend they're for (i.e. CPU)
- Some pointers on how to use them.

Resolves: ONCPUML-1253

Change-Id: I428c98614c309c9ffc32d0f32daa24740f7cb967
Signed-off-by: Nathan John Sircombe <nathan.sircombe@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9523
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Diffstat (limited to 'docs/user_guide')
-rw-r--r-- docs/user_guide/how_to_build_and_run_examples.dox | 14
1 files changed, 14 insertions, 0 deletions
diff --git a/docs/user_guide/how_to_build_and_run_examples.dox b/docs/user_guide/how_to_build_and_run_examples.dox
index 8aab445093..e0079cf42a 100644
--- a/docs/user_guide/how_to_build_and_run_examples.dox
+++ b/docs/user_guide/how_to_build_and_run_examples.dox
@@ -513,5 +513,19 @@ To build libraries, examples and tests:
cmake .. -DOPENMP=1 -DWERROR=0 -DDEBUG=0 -DBUILD_EXAMPLES=1 -DBUILD_TESTING=1 -DCMAKE_INSTALL_LIBDIR=.
cmake --build . -j32
+@section S1_8_fixed_format Building with support for fixed format kernels
+
+@subsection S1_8_1_intro_to_fixed_format_kernels What are fixed format kernels?
+
+The GEMM kernels used for convolutions and fully-connected layers in Compute Library employ memory layouts optimized for each kernel implementation. This requires the supplied weights to be re-ordered into an intermediate buffer ready for consumption by the GEMM kernel. Where Compute Library is called from a framework or library which implements operator caching, this re-ordering may no longer be desirable. When using a cached operator, the caller may wish to re-write the weights tensor and re-run the operator using the updated weights. With the default GEMM kernels in Compute Library, the GEMM will still be executed with the previously re-ordered copy of the old weights, leading to incorrect results.
+
+To address this, Compute Library provides a set of GEMM kernels which use a common blocked memory format. These kernels consume the input weights directly from the weights buffer and do not execute an intermediate pre-transpose step. With this approach, it is the responsibility of the user (in this case the calling framework) to ensure that the weights are re-ordered into the required memory format. @ref NEGEMM::has_opt_impl is a static function that queries whether a fixed-format kernel exists and, if so, returns the expected weights format. The supported weight formats are enumerated in @ref arm_compute::WeightFormat.
+
+@subsection S1_8_2_building_fixed_format Building with fixed format kernels
+
+Fixed format kernels are only available for the CPU backend. To build Compute Library with fixed format kernels set fixed_format_kernels=1:
+
+ scons Werror=1 debug=0 neon=1 opencl=0 embed_kernels=0 os=linux multi_isa=1 build=native cppthreads=1 openmp=0 fixed_format_kernels=1
+
*/
} // namespace arm_compute
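
Note on the new S1_8_1 text: the caller-side re-ordering it describes can be illustrated with a minimal sketch. The helper below is hypothetical and is not part of Compute Library. It assumes a row-major weights matrix whose rows are grouped into blocks of `block` rows, with the final block zero-padded. The layout actually required for each `arm_compute::WeightFormat` value is defined by the library, not by this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a Compute Library API): re-orders a row-major
// [rows x cols] weights matrix into a blocked layout of shape
// [ceil(rows / block)][cols][block]. Rows that do not fill the last
// block are padded with zeros.
std::vector<float> reorder_to_blocked(const std::vector<float> &src,
                                      std::size_t rows,
                                      std::size_t cols,
                                      std::size_t block)
{
    assert(block > 0);
    assert(src.size() == rows * cols);

    const std::size_t num_blocks = (rows + block - 1) / block;
    std::vector<float> dst(num_blocks * cols * block, 0.0f);

    for (std::size_t r = 0; r < rows; ++r)
    {
        const std::size_t b = r / block; // index of the block holding row r
        const std::size_t i = r % block; // position of row r inside its block
        for (std::size_t c = 0; c < cols; ++c)
        {
            // Elements of the same column from `block` consecutive rows
            // end up contiguous in memory.
            dst[(b * cols + c) * block + i] = src[r * cols + c];
        }
    }
    return dst;
}
```

A calling framework would perform a re-ordering of this kind once, whenever it re-writes the weights, and then hand the blocked buffer to the cached operator. With fixed format kernels the operator reads that buffer directly, so no stale intermediate copy is involved.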