Mini-course on Performance Engineering: How to write code that runs fast
Application developers often do not know how efficient their code is or where its performance limit lies. Performance optimization, if done at all, is often restricted to trial-and-error rewrites and performance sampling. Inefficient code, both serial and parallel, tends to scale well, which hides the fact that running it on an HPC cluster is a huge waste of computational resources.
This course covers key concepts on how to write highly efficient sequential and multithreaded code. For this, we use matrix-matrix multiplication as a model problem. Starting from a simple textbook implementation of matrix-matrix multiplication, we discuss CPU architectural features and how the code can be rewritten to benefit from these.
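As a rough illustration of the starting point, a textbook implementation might look like the sketch below. This is not taken from the course material: the function name matmul_naive, the restriction to square matrices, and the row-major storage layout are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Textbook matrix-matrix multiplication C = A * B.
 * Assumptions (not from the course material): square n-by-n matrices,
 * stored contiguously in row-major order, with c not aliasing a or b. */
void matmul_naive(size_t n, const double *a, const double *b, double *c)
{
    assert(a != NULL && b != NULL && c != NULL);

    for (size_t i = 0; i < n; ++i) {          /* row of A and C */
        for (size_t j = 0; j < n; ++j) {      /* column of B and C */
            double sum = 0.0;
            for (size_t k = 0; k < n; ++k) {  /* dot product of row i and column j */
                sum += a[i * n + k] * b[k * n + j];
            }
            c[i * n + j] = sum;
        }
    }
}
```

In this loop order the innermost loop reads b with a stride of n elements, so consecutive iterations touch different cache lines once n is large. Access patterns of this kind are a typical example of where knowledge of the CPU architecture can guide a rewrite.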
Requirements
Good knowledge of C (in particular pointers)
Basic knowledge of OpenMP
Knowledge of how to build using CMake and link against libraries
Tentative schedule
September 12
13:15-15:00 Presentation/Lecture
15:15-17:00 Hands-on session
September 13
8:15-10:00 Hands-on session
10:15-12:00 Presentation/Lecture
13:15-15:00 Hands-on session
For more information about the event and to register (the number of seats is limited), contact Eddie Wadbro: eddie.wadbro@kau.se