From 31441595009182c985dacbedc70c41ee6664d070 Mon Sep 17 00:00:00 2001
From: Ryan OShea
Date: Mon, 7 Nov 2022 16:20:48 +0000
Subject: IVGCVSW-7214 Disable BF16-Turbo-Mode and remove conversion layers

 - Remove Bf16ToFp32 Conversion Layer
 - Remove Fp32ToBf16 Conversion Layer
 - Remove Bf16 Conversion tests
 * Throw exception if m_ReduceFp32ToBf16 optimizer option is set to true
 * Provide comments to enable fast math in order to use bf16
 * Update docs to inform users to enable fast math for bf16

 ExecuteNetwork Changes
 * Require bf16_turbo_mode to also have fast_math_enabled set to true
 - Remove setting m_ReduceFp32ToBf16 optimizer option

Signed-off-by: Ryan OShea
Change-Id: Ibaa6da9d29c96a1ce32ff5196b0847fde9f04a1c
---
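A minimal sketch of how a caller now reaches Bf16, assuming the public Arm NN
C++ API (BackendOptions, OptimizerOptions, Optimize); the network-building step
is a placeholder and nothing below is part of this patch:

    #include <armnn/ArmNN.hpp>

    int main()
    {
        using namespace armnn;

        // Create the runtime and an empty network; real code would
        // populate the network with Fp32 layers before optimizing.
        IRuntime::CreationOptions runtimeOptions;
        IRuntimePtr runtime = IRuntime::Create(runtimeOptions);
        INetworkPtr network = INetwork::Create();

        // Bf16 is no longer requested via m_ReduceFp32ToBf16 (setting it
        // now throws). Instead, enable fast math on the backend, which
        // allows Bf16 kernels to be chosen where they are profitable.
        OptimizerOptions optimizerOptions;
        optimizerOptions.m_ModelOptions.push_back(
            BackendOptions("CpuAcc", {{"FastMathEnabled", true}}));

        IOptimizedNetworkPtr optNet = Optimize(*network,
                                               {Compute::CpuAcc},
                                               runtime->GetDeviceSpec(),
                                               optimizerOptions);
        return 0;
    }

ExecuteNetwork enforces the same pairing: requesting bf16_turbo_mode without
fast_math_enabled set to true is now rejected.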
 docs/02_operator_list.dox     | 84 -------------------------------------------
 docs/05_05_runtimeoptions.dox |  2 +-
 2 files changed, 1 insertion(+), 85 deletions(-)

(limited to 'docs')

diff --git a/docs/02_operator_list.dox b/docs/02_operator_list.dox
index 3a902c8883..d9a3d2c83b 100644
--- a/docs/02_operator_list.dox
+++ b/docs/02_operator_list.dox
@@ -654,48 +654,6 @@ where N = batches, C = channels, H = height, W = width
        <tr><th>
         <tr><td>All
       </table>
-  <tr>
-    <td rowspan="3">ConvertBf16ToFp32Layer
-    <td rowspan="3" style="width:200px;"> Layer to convert BFloat16 tensor to Float32 tensor.
-    <td rowspan="3">
-        <ul>
-         <li>All
-        </ul>
-    <td>CpuRef
-    <td>
-        <ul>
-         <li>All
-        </ul>
-    <td>
-      <table>
-       <tr><th>
-        <tr><td>BFLOAT16
-        <tr><td>FLOAT32
-      </table>
-<tr>
-    <td>CpuAcc
-    <td>
-        <ul>
-         <li>All
-        </ul>
-    <td>
-      <table>
-       <tr><th>
-        <tr><td>BFLOAT16
-        <tr><td>FLOAT32
-      </table>
-<tr>
-    <td>GpuAcc
-    <td>
-        <ul>
-         <li>All
-        </ul>
-    <td>
-      <table>
-       <tr><th>
-        <tr><td>BFLOAT16
-        <tr><td>FLOAT32
-      </table>
   <tr>
     <td rowspan="3">ConvertFp16ToFp32Layer
     <td rowspan="3" style="width:200px;"> Layer to convert Float16 tensor to Float32 tensor.
@@ -738,48 +696,6 @@ where N = batches, C = channels, H = height, W = width
         <tr><td>FLOAT16
         <tr><td>FLOAT32
       </table>
-  <tr>
-    <td rowspan="3">ConvertFp32ToBf16Layer
-    <td rowspan="3" style="width:200px;"> Layer to convert Float32 tensor to BFloat16 tensor.
-    <td rowspan="3">
-        <ul>
-         <li>All
-        </ul>
-    <td>CpuRef
-    <td>
-        <ul>
-         <li>All
-        </ul>
-    <td>
-      <table>
-       <tr><th>
-        <tr><td>BFLOAT16
-        <tr><td>FLOAT32
-      </table>
-<tr>
-    <td>CpuAcc
-    <td>
-        <ul>
-         <li>All
-        </ul>
-    <td>
-      <table>
-       <tr><th>
-        <tr><td>BFLOAT16
-        <tr><td>FLOAT32
-      </table>
-<tr>
-    <td>GpuAcc
-    <td>
-        <ul>
-         <li>All
-        </ul>
-    <td>
-      <table>
-       <tr><th>
-        <tr><td>BFLOAT16
-        <tr><td>FLOAT32
-      </table>
   <tr>
     <td rowspan="3">ConvertFp32ToFp16Layer
     <td rowspan="3" style="width:200px;"> Layer to convert Float32 tensor to Float16 tensor.
diff --git a/docs/05_05_runtimeoptions.dox b/docs/05_05_runtimeoptions.dox
index 454d4af740..b5888eed60 100644
--- a/docs/05_05_runtimeoptions.dox
+++ b/docs/05_05_runtimeoptions.dox
@@ -81,7 +81,7 @@ OptimizerOptions are a set of parameters specifically targeting the Arm NN optim
 Arm NN Parameter | Delegate | Support library | Values | Description
 :--------------- | :-------- | :-------------- | :----- | :----------
 reduceFp32ToFp16 | reduce-fp32-to-fp16 | (Not available) | ["true"/"false"] | Note This feature works best if all operators of the model are in Fp32. ArmNN will add conversion layers between layers that weren't in Fp32 in the first place or if the operator is not supported in Fp16. The overhead of these conversions can lead to a slower overall performance if too many conversions are required.
-reduceFp32ToBf16 | reduce-fp32-to-bf16 | (Not available) | ["true"/"false"] | This feature works best if all operators of the model are in Fp32. ArmNN will add conversion layers between layers that weren't in Fp32 in the first place or if the operator is not supported in Bf16. The overhead of these conversions can lead to a slower overall performance if too many conversions are required.
+reduceFp32ToBf16 | reduce-fp32-to-bf16 | (Not available) | ["true"/"false"] | This feature has been replaced by enabling Fast Math in compute library backend options. This is currently a placeholder option
 debug | debug-data | (Not available) | ["true"/"false"] | If the debug flag is set a DebugLayer is inserted after each layer. The action of each debug layer is backend specific.
 importEnabled | memory-import | (Not available) | ["true"/"false"] | Instructs the optimizer that this model will be importing it's input tensors. This value must match the MemorySource set for input in INetworkProperties.
 exportEnabled | (Not available) | (Not available) | ["true"/"false"] | Instructs the optimizer that this model will be exporting it's output tensors. This value must match the MemorySource set for output in INetworkProperties.
--
cgit v1.2.1
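For orientation, the optimizer options documented in the table above correspond
to fields on armnn::OptimizerOptions. A short illustrative sketch, assuming the
field names from the public header; MakeOptimizerOptions is a hypothetical
helper and the values are arbitrary examples:

    #include <armnn/ArmNN.hpp>

    // Hypothetical helper mapping the documented options onto
    // armnn::OptimizerOptions fields.
    armnn::OptimizerOptions MakeOptimizerOptions()
    {
        armnn::OptimizerOptions options;
        options.m_ReduceFp32ToFp16 = true; // reduceFp32ToFp16
        options.m_Debug            = true; // debug
        options.m_ImportEnabled    = true; // importEnabled: must match the input
                                           // MemorySource in INetworkProperties
        options.m_ExportEnabled    = true; // exportEnabled: must match the output
                                           // MemorySource in INetworkProperties
        // options.m_ReduceFp32ToBf16 = true; // reduceFp32ToBf16: per this patch,
        //                                    // setting this now throws; enable
        //                                    // FastMathEnabled on the backend instead
        return options;
    }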