Age | Commit message | Author |
|
Improved block size selection by favouring larger
block sizes for elementwise operations.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: I5b30b358d84fcd672935b863c2154bd8f4ccd928
|
|
This reverts commit d2b5510697e7789f5a416f9d80d3cb640eecc092.
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: Ia3043bc9c27fe2f72f3ab2f6f7341b3a9adb4231
|
|
Add mypy to pre-commit and clean up all reported errors.
Signed-off-by: Jonas Ohlsson <jonas.ohlsson@arm.com>
Change-Id: If7dc869f5fecdb0e2db40f14e7d9db21aa33df71
|
|
- The number of accumulators is doubled in an Ethos-U configuration with
2 cores
- Likewise, for elementwise, depthwise and pooling operations
the IFM buffer depth capacity is doubled
- FindBlock: step the search space depth in multiples of ublock * ncores
Change-Id: I923cc347a2f252876d405ed93095d39181103f81
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
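The depth stepping described in the FindBlock item above can be sketched roughly as follows. This is only an illustration, not the code from the commit: the function name depth_candidates and its parameters are hypothetical, and the ublock depth and maximum depth values are assumed inputs.

```python
def depth_candidates(ublock_depth: int, ncores: int, max_depth: int) -> list[int]:
    """Return candidate block depths to search (hypothetical helper).

    Assumption: the search steps in multiples of ublock_depth * ncores,
    so that every candidate depth divides evenly across the cores.
    """
    step = ublock_depth * ncores
    return list(range(step, max_depth + 1, step))


# With one core the step is the ublock depth; with two cores it doubles.
single_core = depth_candidates(ublock_depth=8, ncores=1, max_depth=32)
dual_core = depth_candidates(ublock_depth=8, ncores=2, max_depth=64)
```

On a 2-core configuration the step doubles, so the search visits half as many depths over the same range.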
|
|
* The 1D optimised block_config was incorrectly being set to the ArchitectureBlockConfig in try_block_config()
* Write external API test for the reduced block height case (on H256)
Signed-off-by: James Ward <james.ward@arm.com>
Change-Id: I9ced7eb31b23730e4423aabbaf769bc72fac8fc9
|
|
Fixed output diff for some architectures due to incorrect IFM buffer size
calculation when using NearestNeighbour upscaling.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: I0d6d1efc606603cdd6188ae282e7f6babfd7e24e
|
|
Reinstated the v2.1.0 functionality for find_block_configs(). This is
used exclusively by the external API.
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: I6977f13866957edb083769658cc8c57c2b3556fb
|
|
Deep speech was exhibiting poor performance in its first three
layers due to poor SHRAM utilisation.
- Given a choice between multiple identical-cost block configs,
the allocator was choosing the first one it encountered. This
commit biases the choice towards blocks with a larger IFM
fetch area to improve SHRAM utilisation.
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I2ff18a13444b8812cb451a606ff692bf290e7d20
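The tie-break described above might look something like the sketch below. It is an assumption-laden illustration rather than the allocator's actual code: the Candidate type, its fields and pick_block_config are hypothetical names, and the cost is treated as a single integer.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    """Hypothetical stand-in for a block config and its estimated cost."""
    name: str
    cost: int
    ifm_fetch_area: int


def pick_block_config(candidates: list[Candidate]) -> Candidate:
    # Lowest cost wins; among equal costs, prefer the larger IFM fetch area
    # (negated so that min() favours the bigger area).
    return min(candidates, key=lambda c: (c.cost, -c.ifm_fetch_area))


configs = [
    Candidate("a", cost=10, ifm_fetch_area=64),
    Candidate("b", cost=10, ifm_fetch_area=128),
    Candidate("c", cost=12, ifm_fetch_area=256),
]
best = pick_block_config(configs)
```

Picking the first match would return "a" here; with the bias, "b" is chosen because it has the same cost and a larger fetch area, while "c" is still rejected on cost.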
|
|
- 256 and 512 configuration variants execute 1D convolutions
in an optimised manner compared to their 2x2 microblock
dimensions. This commit takes this into account to improve
Conv1D throughput on these configurations.
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I6ecdf6e4a219e356327b22f8393f50ee8817af23
|
|
- Update block config selection to take into account partial
IFM fetches at the edge of non-whole OFM block data.
- Change to scheduler depth slicing for networks in MLBEDSW-4637
for improved buffering. This helps general performance by buffering
larger depth slices.
- Bug fix for opt_max_schedule always being fitted to SRAM which
prevented the optimisation step running in some cases.
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I97642c5adec3bb684b1daabf2b81574c27d4eef2
|
|
- Merged dev/scheduler at 83639f90e8c828f70de6e29142355a940224959b
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I0050529d4b42da93768c7264296434dd877fb5b4
|