This code accompanies the course Foundations of GPU Computing (coming soon). It implements a simple feed-forward neural network in C, using CUDA and cuBLAS to run on Nvidia GPUs. The code includes the forward and backward (gradient) passes, and an Adam optimizer for training.
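For reference, the Adam update for each parameter $\theta$ with gradient $g_t$ at step $t$ follows the standard form (the particular hyperparameter values used by the code are not specified here):

$$
\begin{aligned}
m_t &= \beta_1 m_{t-1} + (1-\beta_1)\,g_t \\
v_t &= \beta_2 v_{t-1} + (1-\beta_2)\,g_t^2 \\
\hat{m}_t &= m_t/(1-\beta_1^t), \qquad \hat{v}_t = v_t/(1-\beta_2^t) \\
\theta_t &= \theta_{t-1} - \alpha\,\hat{m}_t/(\sqrt{\hat{v}_t} + \epsilon)
\end{aligned}
$$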
The purpose of using C is to reinforce the foundational lessons of the course, forcing us to be explicit about each step: each memory allocation, each kernel call, each stream synchronization.
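As a flavor of that explicitness, here is a minimal sketch of the pattern; the kernel and names are hypothetical, not taken from the course code:

```c
#include <cuda_runtime.h>

/* hypothetical kernel: scales an array in place */
__global__ void scale(float* x, float a, int n) {
  int i = blockIdx.x*blockDim.x + threadIdx.x;
  if (i < n) {
    x[i] *= a;
  }
}

int main() {
  const int n = 1024;
  float* x = NULL;
  cudaStream_t stream;

  cudaStreamCreate(&stream);                          /* explicit stream */
  cudaMallocManaged((void**)&x, n*sizeof(float));     /* explicit allocation */
  scale<<<(n + 255)/256, 256, 0, stream>>>(x, 2.0f, n);  /* explicit kernel call */
  cudaStreamSynchronize(stream);                      /* explicit synchronization */
  cudaFree(x);
  cudaStreamDestroy(stream);
  return 0;
}
```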
This code is open source software. It is licensed under the Apache License, Version 2.0 (the "License"); you may not use it except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0.
The code requires:
- Linux.
- CUDA.
- An Nvidia GPU of the Pascal generation (circa 2016) or later. Specifically, the code makes use of unified virtual memory, and only from the Pascal generation onward can the hardware coherently handle managed memory access and kernel execution concurrently (a sketch for checking this support follows the list).
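Whether a particular GPU supports this can be queried at runtime via the `cudaDevAttrConcurrentManagedAccess` device attribute; a minimal standalone check, independent of the course code:

```c
#include <cuda_runtime.h>
#include <stdio.h>

int main() {
  int device = 0, concurrent = 0;
  cudaGetDevice(&device);
  cudaDeviceGetAttribute(&concurrent, cudaDevAttrConcurrentManagedAccess,
      device);
  if (concurrent) {
    printf("GPU supports concurrent managed access (Pascal or later)\n");
  } else {
    printf("GPU does not support concurrent managed access\n");
  }
  return 0;
}
```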
To build, use:

```
make
```
To run, use:

```
./main
```
If successful, the output will be one line per epoch, reporting the test loss and elapsed time. If the build fails, it may be necessary to modify the Makefile for the system.
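The adjustments most commonly needed are the CUDA installation path and the GPU architecture flag. A hypothetical example of the kinds of variables involved (the actual Makefile may name them differently):

```make
# Hypothetical adjustments; check the actual Makefile for its variable names.
CUDA_HOME = /usr/local/cuda        # path to the CUDA installation
NVCC      = $(CUDA_HOME)/bin/nvcc  # CUDA compiler
ARCH      = -arch=sm_60            # sm_60 is Pascal; raise for newer GPUs
LDFLAGS   = -L$(CUDA_HOME)/lib64 -lcublas -lcudart
```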
The file bikeshare.csv provides the data set. Each row corresponds to one hour of the year 2019, and each column to one of 23 features (e.g. weather and holiday information, rescaled), with the last column giving the label (the number of trips in the hour, normalized).
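For orientation, reading such a comma-separated file in C might look like the following. This is a hypothetical sketch, not the code's actual loader; it assumes one row of numeric fields per line, with the label as the final field:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical loader sketch: reads one CSV row into values[], returning
 * the number of fields parsed, or 0 at end of file. */
int read_row(FILE* file, float* values, int max) {
  char line[4096];
  if (!fgets(line, sizeof(line), file)) {
    return 0;
  }
  int n = 0;
  for (char* tok = strtok(line, ","); tok && n < max;
      tok = strtok(NULL, ",")) {
    values[n++] = strtof(tok, NULL);
  }
  return n;
}
```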
The data has been prepared from the following sources:
- Capital Bikeshare trips
- Capital Bikeshare station locations
- U.S. Local Climatological Data (LCD) (see LCD_documentation.pdf)
- NOAA Solar Geometry Calculator
The course is only concerned with GPU programming and not model performance, but this at least provides some realistic data to play with.