This is a static page that lists all courseware: lecture topics and slides. Details of a specific run of the course (e.g. evaluation pattern and assignments) are in Moodle or Google Classroom.

Please drop me a line in case you find a missing or broken link.

Pre-requisites


This is an elective course taught in the senior-year undergraduate and graduate programs. Students are expected to be familiar with Computer Organisation and Computer Architecture.

Courseware


Contents                            Lecture Slides
Logistics and Overview              Lecture
CPU Architecture                    Lecture
Parallel Computing Architecture     Lecture
Parallel Programming Frameworks     Lecture
First CUDA C Program                Lecture
Profiling with CUDA events          Lecture
Memory Tiling                       Lecture
Advanced Memory Tiling              Lecture
Example - Histogram Computing       Lecture
Example - Convolution Computation   Lecture
Dynamic Shared Memory               Lecture
Streams                             Lecture
Parallelisation Thinking            Lecture
Debugging                           Lecture
GPU within PC Architecture          Lecture
Dynamic Parallelisation - Thrust    Lecture
Parallel Reduce Operation           Lecture
Parallel Scan Operation             Lecture
OpenACC                             Lecture
Summary                             Lecture