Accelerating Applications with CUDA

Description: Through a combination of lectures and practicals, this course covers the development of CUDA programs for execution on NVIDIA GPUs, with an emphasis on achieving good performance. No knowledge of CUDA is assumed. Topics covered in the lectures include: an overview of GPU hardware, SIMT multithreading, the different kinds of memory, caching and shared memory, conditional code and warp divergence, overlapping execution with data traffic, multi-GPU programming and the Unified Address Space, performance tuning and techniques for bandwidth-constrained algorithms, atomics, and the availability of libraries and resources for further study.

Aimed at: Anyone interested in writing CUDA programs for NVIDIA GPUs.

Prerequisites: No prior knowledge of CUDA is required. Attendees are assumed to be comfortable with the basic notions of parallel programming: threads, breaking a problem up into independent pieces, and race conditions (some familiarity with OpenMP or MPI should be sufficient). Attendees should be competent C programmers and familiar with working in a UNIX environment (i.e., able to connect to a machine remotely, use basic UNIX commands, edit a source file, and understand the elementary steps in compiling object files and creating executables).

Duration: 3 days.

After Course Attendees Will: Be able to develop CUDA programs for single- or multi-GPU workstations, and be able to develop their skills further by studying the CUDA example codes provided by NVIDIA.

Registration: To register for HECToR courses, go to the booking form.