Mini-course on Performance Engineering: How to write code that runs fast

Application developers often do not know how efficient their code is or where its performance limit lies. Performance optimizations, if any, are often restricted to trial-and-error rewrites and performance sampling. Inefficient code, both serial and parallel, tends to scale well, which hides the fact that running it on an HPC cluster is a huge waste of computational resources.

This course covers key concepts on how to write highly efficient sequential and multithreaded code. For this, we use matrix-matrix multiplication as a model problem. Starting from a simple textbook implementation of matrix-matrix multiplication, we discuss CPU architectural features and how the code can be rewritten to benefit from these.
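
For orientation, a textbook implementation of the kind the course starts from might look like the sketch below. It is not taken from the course material: the function name matmul_naive, the argument order, and the row-major storage layout are assumptions made only for illustration.

```c
#include <stddef.h>

/* Textbook matrix-matrix multiplication: C = A * B.
 * A is m x k, B is k x n, C is m x n.
 * Assumption: all matrices are stored row-major in contiguous arrays. */
void matmul_naive(size_t m, size_t n, size_t k,
                  const double *A, const double *B, double *C)
{
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            double sum = 0.0;
            /* Dot product of row i of A with column j of B. */
            for (size_t p = 0; p < k; ++p) {
                sum += A[i * k + p] * B[p * n + j];
            }
            C[i * n + j] = sum;
        }
    }
}
```

In this loop order the innermost loop walks B with a stride of n elements, so consecutive iterations touch different cache lines once n is large. Access patterns like this are one typical reason a straightforward implementation falls well short of what the CPU can deliver.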

Requirements

Good knowledge of C (in particular pointers)
Basic knowledge of OpenMP
Knowledge of how to build using CMake and link against libraries

Tentative schedule

September 12
13:15-15:00 Presentation/Lecture
15:15-17:00 Hands-on session

September 13
8:15-10:00 Hands-on session
10:15-12:00 Presentation/Lecture
13:15-15:00 Hands-on session

For more information about the event and to register (the number of seats is limited), contact Eddie Wadbro: eddie.wadbro@kau.se
