author | Narumol Prangnawarat <narumol.prangnawarat@arm.com> | 2022-01-28 17:59:18 +0000 |
---|---|---|
committer | Jim Flynn <jim.flynn@arm.com> | 2022-01-31 12:53:51 +0000 |
commit | e2af6f4322a1e2b8b3c391fb721a6a80c281477f (patch) | |
tree | b0dd53289e27304a6d724821459cb0f4b6343a39 /src/backends/cl/ClImportTensorHandle.hpp | |
parent | fd313fef775ed210f8dab84452ea382a0b4164b0 (diff) | |
download | armnn-e2af6f4322a1e2b8b3c391fb721a6a80c281477f.tar.gz |
IVGCVSW-6552 Add support of aligned host memory
* Add AllocatedData functions to OutputHandler
* Enable import aligned memory in ImportInputs
* Enable import aligned memory in ImportOutputs
* Allow to import input and output if the memory is aligned
* Implement Reconfigure function on ClConvolution2dWorkload
* End-to-end test on Ref and Cl to ensure that input and output memory
are imported when aligned
Signed-off-by: Narumol Prangnawarat <narumol.prangnawarat@arm.com>
Change-Id: I9e5e4c26d1ac2f1d806803ade5f64c6479c51718
Diffstat (limited to 'src/backends/cl/ClImportTensorHandle.hpp')
-rw-r--r-- | src/backends/cl/ClImportTensorHandle.hpp | 12 |
1 files changed, 10 insertions, 2 deletions
diff --git a/src/backends/cl/ClImportTensorHandle.hpp b/src/backends/cl/ClImportTensorHandle.hpp
index a236a70d7c..54710d8135 100644
--- a/src/backends/cl/ClImportTensorHandle.hpp
+++ b/src/backends/cl/ClImportTensorHandle.hpp
@@ -205,7 +205,11 @@ public:
     // We do this to match the behaviour of the Import function later on.
     auto cachelineAlignment =
         arm_compute::CLKernelLibrary::get().get_device().getInfo<CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE>();
-    auto roundedSize = cachelineAlignment + totalBytes - (totalBytes % cachelineAlignment);
+    auto roundedSize = totalBytes;
+    if (totalBytes % cachelineAlignment != 0)
+    {
+        roundedSize = cachelineAlignment + totalBytes - (totalBytes % cachelineAlignment);
+    }
 
     cl_int error = CL_SUCCESS;
     cl_mem buffer;
@@ -252,7 +256,11 @@ private:
     // This does not change the size of the buffer, only the size of the mapping the buffer is mapped to
     auto cachelineAlignment =
         arm_compute::CLKernelLibrary::get().get_device().getInfo<CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE>();
-    auto roundedSize = cachelineAlignment + totalBytes - (totalBytes % cachelineAlignment);
+    auto roundedSize = totalBytes;
+    if (totalBytes % cachelineAlignment != 0)
+    {
+        roundedSize = cachelineAlignment + totalBytes - (totalBytes % cachelineAlignment);
+    }
 
     cl_int error = CL_SUCCESS;
     cl_mem buffer;
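The effect of both hunks can be illustrated in isolation. The sketch below is not code from Arm NN: the helper names `RoundUpAlways` and `RoundUpIfUnaligned` are hypothetical, and the alignment is passed in as a plain parameter rather than queried from the OpenCL device. It assumes a non-zero alignment. The old expression always added one extra cache line, even when `totalBytes` was already a multiple of the alignment; the new logic leaves an aligned size unchanged.

```cpp
#include <cstddef>

// Hypothetical helper mirroring the expression removed by this commit:
// it over-allocates by one full cache line when totalBytes is already aligned.
// Assumes alignment > 0.
std::size_t RoundUpAlways(std::size_t totalBytes, std::size_t alignment)
{
    return alignment + totalBytes - (totalBytes % alignment);
}

// Hypothetical helper mirroring the logic added by this commit:
// round up only when totalBytes is not a multiple of the alignment.
// Assumes alignment > 0.
std::size_t RoundUpIfUnaligned(std::size_t totalBytes, std::size_t alignment)
{
    std::size_t roundedSize = totalBytes;
    if (totalBytes % alignment != 0)
    {
        roundedSize = alignment + totalBytes - (totalBytes % alignment);
    }
    return roundedSize;
}
```

With a 64-byte alignment, for example, a 128-byte tensor stays at 128 bytes under the new logic but would have been rounded to 192 bytes under the old one. Unaligned sizes such as 100 bytes round up to 128 bytes in both versions.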