Age | Commit message (Collapse) | Author |
|
- 256 and 512 configuration variants execute 1D convolutions
in an optimised manner compared to their 2x2 microblock
dimensions. This commit takes this into account to improve
Conv1D throughput on these configurations.
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I6ecdf6e4a219e356327b22f8393f50ee8817af23
|
|
- Update block config selection to take into account partial
IFM fetches at edge of non-whole OFM block data.
- Change to scheduler depth slicing for networks in MLBEDSW-4637
for improved buffering. This helps general performance by buffering
larger depth slices.
- Bug fix for opt_max_schedule always being fitted to SRAM which
prevented the optimisation step running in some cases.
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I97642c5adec3bb684b1daabf2b81574c27d4eef2
|
|
- Merged dev/scheduler at 83639f90e8c828f70de6e29142355a940224959b
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I0050529d4b42da93768c7264296434dd877fb5b4
|