In this document, we describe a method for deriving detailed, system-level quantitative models for predicting performance, power, and energy consumption based on source code software metrics.
The models are built using various statistical and machine learning methods. The predictors (independent variables) are extracted from the source code via static analysis, and the output of each model is an estimate of the gain (in running time, average power, or energy) of executing a software element (a.k.a. kernel) on a specific accelerator (e.g., a now-widespread GPU, or an FPGA unit) instead of on the CPU(s) of the host system, together with the associated cost of transferring the data necessary to run the kernel on the selected accelerator.
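The gain notion can be made concrete with a minimal sketch. This is an illustrative reading, not the report's exact definition: the running-time gain is taken as host CPU time minus accelerator kernel time and the data-transfer cost, and all values below are invented.

```python
# Illustrative sketch only: one plausible formalization of the "gain" the
# models predict. The report's actual definition may differ.

def speedup_gain(t_cpu: float, t_kernel: float, t_transfer: float) -> float:
    """Running-time gain of offloading a kernel to an accelerator:
    host (CPU) time minus accelerator kernel time and the cost of
    moving the kernel's data to/from the device."""
    return t_cpu - (t_kernel + t_transfer)

# A positive gain suggests offloading pays off; negative suggests it does not.
print(speedup_gain(12.0, 3.5, 1.5))
```

Analogous expressions can be written for average power and for energy; the structure (host cost minus accelerator cost plus transfer overhead) stays the same.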
To build the desired prediction models, we first took several algorithms, referred to as benchmarks, that were implemented both in sequential C++ and in the platform-independent OpenCL C language (assisted by C/C++ host code). We then instrumented these benchmarks so that we could measure energy consumption and statically analyze the most promising regions of the source code. Moreover, we separated the different phases of an OpenCL application's execution: initialization, data transfers, and kernel execution. After this, we extracted multiple code size, coupling, and complexity metrics from the kernels via static code analysis, and aggregated them to system level for each benchmark. In parallel with the source code analysis, we also measured the performance, energy, and power required by each phase of these algorithms by executing them on different computing devices (most notably, also on FPGA), using internal and external measurement methods. Finally, we applied several statistical and machine learning methods that combine the static and dynamic information to build predictive models.
There are no prerequisites for using the created models: to apply them to a previously unseen program, only source code analysis and the specification of the problem size (as defined in this report) are needed; the extracted metrics can then be fed to one of the models to predict the best computing device to run the kernel on, as well as the expected gain.
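The device-selection use of a trained model can be sketched as below. The classifier is a stand-in nearest-neighbour rule over invented training points; the report's actual models are built with proper statistical and ML methods, and the metric vectors here are purely illustrative.

```python
# Hedged sketch: predicting the best device for an unseen kernel from its
# static metrics alone. Training rows (metric vector, best device) are
# invented; a nearest-neighbour rule stands in for the trained classifier.

TRAIN = [
    ((120.0, 2.0), "CPU"),
    ((950.0, 8.0), "GPU"),
    ((400.0, 16.0), "FPGA"),
]

def best_device(metrics):
    """Pick the device label of the nearest training point (Euclidean)."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(TRAIN, key=lambda row: dist2(row[0], metrics))[1]

print(best_device((900.0, 7.0)))
```

Note that no execution of the unseen program is required: its metric vector comes entirely from static analysis plus the stated problem size.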
In summary, we describe the method for creating predictive models through concrete experiments and based on two sets of benchmark programs. However, the modeling methodology is not specific to these benchmarks; it can be applied to and repeated on any alternative benchmark set, should the need arise (e.g., on larger benchmark sets, or on more domain-specific ones). Our presented results validate the idea of using source code metrics to predict runtime performance and power consumption for accelerator devices. With the selected metrics, some classification models reached over 65% precision, while several regression models showed correlations above 0.90 for various aspects and configurations. Finally, such individual models can be combined into a hierarchical full-system prediction model.