Dynamic Optimization of On-Chip Memories for HLS Targeting Many-Accelerator Platforms
Abstract
Many-accelerator platforms have been introduced to maximize FPGA throughput. However, because the high saturation rate of the FPGA's on-chip memories limits the number of accelerators that can be synthesized, frameworks for Dynamic Memory Management (DMM) have been suggested that allow the synthesized designs to allocate and de-allocate on-chip memory resources at run-time. Although those frameworks increase accelerator density by minimizing the utilized memory resources, the parallel execution of many accelerators may cause severe memory fragmentation and, in turn, memory allocation failures. In this work, we propose a framework that optimizes memory usage by performing memory defragmentation operations in HLS many-accelerator architectures that share on-chip memories. Experimental results highlight the effectiveness of the proposed solution: it eliminates memory allocation failures caused by memory fragmentation, reduces memory allocation failures by up to 32% on average, and decreases memory size requirements by up to 5%, with controllable latency and resource utilization overhead.
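To make the failure mode concrete, the following is a minimal software sketch of how fragmentation in a shared heap can cause an allocation to fail even though enough total memory is free, and how compaction recovers it. It is not the proposed framework: the `Heap` class, the block-granular first-fit policy, and the sliding compaction in `defragment` are all assumptions chosen for illustration only.

```python
class Heap:
    """Toy shared heap of fixed-size blocks (hypothetical model, not the paper's design)."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.allocs = {}  # owner (accelerator id) -> (base block, length in blocks)

    def _free_gaps(self):
        # Yield (base, length) for every free gap, in address order.
        cursor = 0
        for base, length in sorted(self.allocs.values()):
            if base > cursor:
                yield cursor, base - cursor
            cursor = base + length
        if cursor < self.num_blocks:
            yield cursor, self.num_blocks - cursor

    def alloc(self, owner, length):
        # First-fit (assumed policy): take the first contiguous gap that is large enough.
        for base, gap in self._free_gaps():
            if gap >= length:
                self.allocs[owner] = (base, length)
                return base
        return None  # allocation failure

    def free(self, owner):
        del self.allocs[owner]

    def defragment(self):
        # Slide live allocations toward address 0, preserving their order,
        # so that all free blocks merge into one contiguous region.
        cursor = 0
        in_address_order = sorted(self.allocs.items(), key=lambda kv: kv[1][0])
        for owner, (_, length) in in_address_order:
            self.allocs[owner] = (cursor, length)
            cursor += length


heap = Heap(8)
for name in "ABCD":
    heap.alloc(name, 2)          # heap is now full: A, B, C, D
heap.free("A")
heap.free("C")                   # 4 blocks free, but split into two gaps of 2

before = heap.alloc("E", 4)      # fails: no contiguous gap of 4 blocks
heap.defragment()                # B and D are moved to the low addresses
after = heap.alloc("E", 4)       # succeeds in the merged free region
```

In this sketch the request for four blocks fails before compaction and succeeds afterwards, without enlarging the heap. A hardware realization would also have to copy the data and update the address each accelerator holds, which is where latency and resource overhead arise.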