The REPARA project aims to deploy software kernels of a sequential C++ application on a parallel heterogeneous platform, using static or dynamic scheduling and mapping techniques, with the objective of improving both performance and energy efficiency. The run-time system plays an important role in both the performance and the energy-efficient execution of kernels. In the REPARA project the selected run-time is the FastFlow parallel framework, which has been extended to support a uniform kernel execution model across GPUs, FPGAs and DSPs.
An overall description of the FastFlow framework can be found in REPARA deliverable D6.1 (Static runtimes for coordination in heterogeneous platforms). Deliverable D6.2 (Dynamic runtimes for heterogeneous platforms) describes how the FastFlow framework can be used for the dynamic selection of multiple target devices, and how REPARA HW accelerators can be used for fine-grain parallel execution of REPARA kernels. The FastFlow runtime uses OpenCL to target GPU devices and the ThreadPoolComposer API layer (developed in Task T5.1) to target both FPGA and DSP devices.
In this deliverable we focus on the mechanisms that support the migration of kernel execution across different target devices and the collection of run-time metrics (such as energy consumption, memory consumption and execution time) that are ultimately provided to the Dynamic Partitioning Engine (DPE) so that it can decide where and when given kernels of the application have to be executed. The DPE is part of the partitioning framework and represents the bridge between the low-level REPARA run-time and the high-level tools where kernel scheduling decisions are made. In particular, we describe the communication protocol between the DPE and the FastFlow parallel patterns, and how commands received from the DPE are actuated by the pattern run-time.