GPU Programming

CS 179

The use of Graphics Processing Units for rendering is well known, but their power for general parallel computation has only recently been explored. Parallel algorithms running on GPUs can often achieve up to 100x speedup over similar CPU algorithms, with existing applications in physics simulation, signal processing, financial modeling, neural networks, and countless other fields.

This course will cover programming techniques for the GPU. The course will introduce NVIDIA's parallel computing platform, CUDA. Beyond covering the CUDA programming model and syntax, the course will also discuss GPU architecture, high-performance computing on GPUs, parallel algorithms, CUDA libraries, and applications of GPU computing.
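
As a taste of the CUDA programming model the lectures will develop, here is a minimal vector-addition sketch (not a course handout; the file name and values are illustrative, but the __global__ kernel syntax, the <<<blocks, threads>>> launch, and the cudaMalloc/cudaMemcpy calls are the standard CUDA runtime API):

    // vector_add.cu -- minimal sketch of the CUDA programming model
    // (illustrative only, not a course example): each GPU thread adds
    // one pair of elements. Compile with: nvcc vector_add.cu -o vector_add
    #include <cstdio>
    #include <cstdlib>
    #include <cuda_runtime.h>

    __global__ void vectorAdd(const float *a, const float *b, float *c, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
        if (i < n)
            c[i] = a[i] + b[i];
    }

    int main() {
        const int n = 1 << 20;
        size_t bytes = n * sizeof(float);

        // Allocate and fill host arrays.
        float *ha = (float *)malloc(bytes);
        float *hb = (float *)malloc(bytes);
        float *hc = (float *)malloc(bytes);
        for (int i = 0; i < n; ++i) { ha[i] = 1.0f; hb[i] = 2.0f; }

        // Allocate device arrays and copy the inputs to the GPU.
        float *da, *db, *dc;
        cudaMalloc((void **)&da, bytes);
        cudaMalloc((void **)&db, bytes);
        cudaMalloc((void **)&dc, bytes);
        cudaMemcpy(da, ha, bytes, cudaMemcpyHostToDevice);
        cudaMemcpy(db, hb, bytes, cudaMemcpyHostToDevice);

        // Launch enough 256-thread blocks to cover all n elements.
        int threads = 256;
        int blocks = (n + threads - 1) / threads;
        vectorAdd<<<blocks, threads>>>(da, db, dc, n);

        // Copy the result back and spot-check it.
        cudaMemcpy(hc, dc, bytes, cudaMemcpyDeviceToHost);
        std::printf("c[0] = %f (expected 3.0)\n", hc[0]);

        cudaFree(da); cudaFree(db); cudaFree(dc);
        free(ha); free(hb); free(hc);
        return 0;
    }

Each GPU thread handles one array element; the first weeks of lectures build up this model and its performance implications.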

Problem sets will cover performance optimization and specific GPU applications in fields such as numerical mathematics, medical imaging, and finance.

This quarter we will also cover uses of the GPU in Machine Learning.

Labwork will require significant programming. A working knowledge of the C programming language will be necessary. Although CS 24 is not a prerequisite, it (or equivalent systems programming experience) is strongly recommended.

9 units; third term.

   
Instructors/TAs: George Stathopoulos - gstathop@caltech.edu
Mary Giambrone - mgiambro@caltech.edu
Jenny Lee - clee7@caltech.edu
  • Piazza: Please ask through Piazza if you have a question or issue that likely affects other students. Piazza will also be used to make announcements throughout the course.
  • TA email: cs179tas@googlegroups.com. In general, use Piazza for questions on the assignments or the material, since these may be of interest to other people. Email the TAs if you have something that affects only you or your project group.
  • Lab submission: IMPORTANT (updated for 2019): Instead of emailing your lab solution to the TA email, put a zip file of your solution in your home directory on Titan, named lab[N]_2019_submission.zip. See Piazza for more details!
Supervising Professor: Professor Al Barr - barradmin@cs.caltech.edu

 
Time and place: MWF 3:00-3:55 PM
Annenberg 107
 
Office Hours: Based on the responses to the surveys, we have determined that the class time will remain 3-4 PM MWF, as originally scheduled. In addition, the scheduled due dates will remain 3 PM on Wednesdays.

    2:30-4:30 PM Sunday (Mary)
    7-9 PM Monday (Jenny)
    4-6 PM Tuesday (George)
    Location: Annenberg 104, the instructional laboratory.
    Note! As of May 15, office hours will be by appointment.

 
Grading policy: Here is the grading scheme for the class:
  • 6 labs (60% of grade)
  • 4-week project (40% of grade)
All labs will be scored out of 100 and are weighted equally (meaning each lab is worth 10% of your grade). The final project can be completed individually or as a pair.

Homework extensions may be granted if the TAs deem it appropriate. E grades will not be granted except under extreme circumstances.

Note! To pass the course, a "sufficient" number of assignments must be submitted and graded before Drop Day!

Lectures: Week 1 (Introduction), MWF George
Lecture 1 (Mon. 04/01): PPT PDF
Lecture 2 (Wed. 04/03): PPT PDF
Lecture 3 (Fri. 04/05): PPT PDF
Week 2 (Shared Memory), MWF Jenny
Lecture 4 (Mon. 04/08): PPT PDF
Lecture 5 (Wed. 04/10): PPT PDF
Lecture 6 (Fri. 04/12): PPT PDF
Week 3 (Reductions, FFT), MWF George
Lecture 7 (Mon. 04/15): PPT PDF
Lecture 8 (Wed. 04/17): PPT PDF
Lecture 9 (Fri. 04/19): PPT PDF
Week 4 (cuBLAS and Graphics), MWF George
Lecture 10 (Mon. 04/22): PPT PDF Google Doc
Lecture 11 (Wed. 04/24): cuBLAS example
Lecture 12 (Fri. 04/26): PPT PDF
Week 5 (Machine Learning and cuDNN I), MWF Jenny
Lecture 13 (Mon. 04/29): PPT PDF
Lecture 14 (Wed. 05/01): PPT PDF
Lecture 15 (Fri. 05/03): PPT PDF
Week 6 (Machine Learning and cuDNN II), MWF Jenny
Lecture 16 (Mon. 05/06): PPT PDF
Lecture 17 (Wed. 05/08): PPT PDF
Week 7 (Projects) MWF no class
Week 8 (Projects) MWF no class
Week 9 (Projects) MWF no class
Week 10 (Projects) MWF no class

Assignments: Lab 1: assignment text UNIX files (updated for 2019). Note: please follow the instructions in this text file instead of the one in the zipped folder.
Lab 2: assignment text UNIX files (updated for 2019)
Lab 3: assignment text UNIX files (updated for 2019)
Lab 4: assignment text UNIX files (updated for 2019). Finalized as of 4-27-19. Also, your project proposals will be due on Friday, May 3, so start working on those, and feel free to ask the TAs if you need guidance.
Lab 5: assignment text UNIX files (updated for 2019)


Project INFO
Textbook: Programming Massively Parallel Processors (3rd Edition) is recommended but not required. Amazon Link.

CUDA Installation: There are three GPU machines available in Annenberg 104, the CMS machine lab. You will need a CMS account to use them.
Alternatively, we can supply a bootable USB image with CUDA preinstalled if you wish.
Otherwise, you can consider this Guide (updated 2019). DANGER! Especially for non-Windows machines, make a clone of your whole computer system before attempting installation! Don't do this casually: without this type of backup you can easily lose your ability to log in and your entire laptop/desktop environment. The loss of a working computer environment can affect your other classes. With the clone backup, however, you should not lose too much time if there is a problem.
The CMS machines or the bootable USB image are the safer options.
To do the full partition backup, a suggested cloning tool is
Clonezilla; you can use these Clonezilla instructions as a reminder.
An excellent USB "burning" tool (for making a Clonezilla drive or the CUDA boot drive) is Rufus, although it requires a Windows environment to run.
Other cloning and burning tools are acceptable if you have your own favorites.
Finally, use this code to retrieve your hardware info after you set up CUDA.
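
If you want a quick sanity check of your own after installation, a short program along these lines will print the basics of each installed GPU (a minimal sketch, not the linked code; it uses only the standard cudaGetDeviceCount and cudaGetDeviceProperties runtime calls, and the file name is illustrative):

    // device_query.cu -- minimal sketch (not the linked hardware-info code)
    // that prints basic properties of each CUDA-capable GPU.
    // Compile with: nvcc device_query.cu -o device_query
    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
        int count = 0;
        if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
            std::printf("No CUDA-capable device found.\n");
            return 1;
        }
        for (int i = 0; i < count; ++i) {
            cudaDeviceProp prop;
            cudaGetDeviceProperties(&prop, i);
            std::printf("Device %d: %s\n", i, prop.name);
            std::printf("  Compute capability: %d.%d\n", prop.major, prop.minor);
            std::printf("  Global memory:      %.1f GiB\n",
                        prop.totalGlobalMem / (1024.0 * 1024.0 * 1024.0));
            std::printf("  Multiprocessors:    %d\n", prop.multiProcessorCount);
        }
        return 0;
    }

The compute capability it reports is what you will look up in the mapping linked under Resources below.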

Resources: CUDA C Programming Guide
List of NVIDIA GPUs
Mapping from GPU name to Compute Capability

Material from previous year(s): 2015
2016
2017
2018


Less useful, but cool resources: NVIDIA's Parallel Forall Blog
Videos from last several years of NVIDIA's conference on CUDA
How to Write Code the Compiler Can Actually Optimize (2015)
Excellent CPU optimization manuals
What Every Programmer Should Know About Memory
GPU focused systems guide to deep learning