mattscar/cuda-book-gnu

Example code for CUDA 13: High-Performance Computing and Graphics

This repository contains the example projects for the book CUDA 13: High-Performance Computing and Graphics by Matt Scarpino.

The example projects in the chapters are:

  • ch02_intro - Simple application to verify installation of the CUDA toolkit
  • ch03_add_arrays - Adds arrays of values together, demonstrates memory operations
  • ch03_check_properties - Reads processing capabilities of the graphics card
  • ch04_half_floats - Demonstrates usage of half-precision floating-point values
  • ch04_memory_spaces - Shows how different memory spaces can be accessed in CUDA
  • ch04_print_builtin - Displays values of CUDA's built-in variables
  • ch04_shared_sync - Demonstrates usage of shared memory and thread block synchronization
  • ch05_mapped_memory - Accelerates data access by taking advantage of mapped memory
  • ch05_stream_events - Shows how CUDA streams and events work together
  • ch05_tree_recursion - Uses dynamic parallelism to implement recursion
  • ch05_wmma_demo - Performs matrix multiplication and accumulation using the WMMA API
  • ch05_complex_dot - Performs a complex-valued dot product using cuBLAS
  • ch06_matrix_vec - Uses cuBLAS to multiply a matrix by a vector
  • ch07_dense_solver - Demonstrates how cuSOLVER can be used to solve a dense matrix system
  • ch07_sparse_solver - Shows how cuDSS can be used to solve a sparse matrix system
  • ch08_fft_ifft - Performs the forward FFT and inverse FFT using the cuFFT package
  • ch09_image_constrast - Updates an image's contrast using the NPPI package
  • ch09_image_filter - Performs image filtering using NPPI
  • ch09_rotate_resize - Rotates and resizes an image using NPPI
  • ch10_texture_squares - Applies texture to an OpenGL rendering
  • ch10_three_squares - Creates a simple OpenGL rendering
  • ch11_basic_interop - Demonstrates how CUDA and OpenGL can work together
  • ch11_spinning_sphere - Animates rendering using CUDA and OpenGL
  • ch12_gaussian_blur - Shows how CUDA and OpenGL can work together to filter textures

The example projects in the appendices are:

  • appa_add_arrays - Uses the CUDA Driver API to add two arrays
  • appa_device_attributes - Accesses device attributes using the CUDA Driver API
  • appa_runtime_compile - Demonstrates how NVRTC makes runtime compilation possible
  • appb_array_reverse - Reverses an array of values using Parallel Thread Execution (PTX)
  • appb_simple_ptx - Presents a simple application based on Parallel Thread Execution (PTX)

The repository contains a file named CMakeLists.txt and a folder for each project. If you've installed CMake and the CUDA Toolkit, the build process consists of five steps:

  1. Open a terminal and change to the top-level directory
  2. Create a build folder with mkdir build
  3. Enter the build folder with cd build
  4. Generate the makefiles with cmake ..
  5. Build each project with cmake --build .
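The five steps above can be sketched as a small shell function. This is only a convenience wrapper under stated assumptions: the name build_examples is hypothetical (it is not part of the repository), it must be called from the top-level directory, and it uses mkdir -p instead of plain mkdir so that an existing build folder is not treated as an error.

```shell
# Hypothetical helper: configure and build the examples.
# Call it from the repository's top-level directory (the one holding CMakeLists.txt).
build_examples() {
  # Stop early if CMake is not on the PATH
  command -v cmake >/dev/null 2>&1 || { echo "cmake not found" >&2; return 1; }

  mkdir -p build &&    # step 2: create the build folder (no error if it exists)
  cd build &&          # step 3: enter the build folder
  cmake .. &&          # step 4: generate the makefiles
  cmake --build .      # step 5: build each project
}
```

Because each command is chained with &&, a failure in one step stops the remaining ones. After a successful call, the shell is left inside the build folder, just as with the manual steps.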

The last step tells CMake to enter each subdirectory listed in CMakeLists.txt and build the source code into an executable. When this finishes, the build folder will contain a subfolder for each project. Each subfolder will contain the compiled executable, which can be run from the command line.

For example, once the ch03_add_arrays project is built, the build folder will contain a subfolder named ch03_add_arrays. If the build succeeded, that subfolder will contain an executable that is also named ch03_add_arrays.
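Given that layout, an example can be launched from inside the build folder by repeating the project name as both the subfolder and the executable. The sketch below wraps this in a hypothetical function named run_example (not part of the repository) that first checks whether the executable exists.

```shell
# Hypothetical helper: run a built example from inside the build folder.
# Usage: run_example ch03_add_arrays
run_example() {
  exe="./$1/$1"   # <project>/<project>, following the layout described above
  if [ -x "$exe" ]; then
    "$exe"
  else
    echo "No executable at $exe; did the build succeed?" >&2
    return 1
  fi
}
```

Running ./ch03_add_arrays/ch03_add_arrays directly from the build folder is equivalent; the function only adds the existence check.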

By default, the last step builds most of the book's projects, but not all of them. The ch07_sparse_solver project requires cuDSS to be installed, and the projects in Chapters 10, 11, and 12 require OpenGL, GLUT, and GLEW. Chapter 7 explains how to install cuDSS and Chapter 10 explains how to install OpenGL, GLUT, and GLEW.

Once these additional packages have been installed, the projects can be added to the build by uncommenting lines in the top-level CMakeLists.txt. For example, once cuDSS is installed, the ch07_sparse_solver project can be added to the build by removing the pound sign (#) in the following line:

#add_subdirectory(ch07_sparse_solver)

Similarly, when OpenGL, GLUT, and GLEW are available, the add_subdirectory lines for the ch10_texture_squares, ch10_three_squares, ch11_basic_interop, ch11_spinning_sphere, and ch12_gaussian_blur projects can be uncommented.
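Editing the file by hand is the simplest approach, but the uncommenting can also be scripted. The following is a minimal sketch, assuming each disabled line starts with # followed (possibly after whitespace) by add_subdirectory(<project>), as in the line shown above. The function name enable_projects is hypothetical and not part of the repository.

```shell
# Hypothetical helper: uncomment add_subdirectory lines for the named projects.
# Usage: enable_projects CMakeLists.txt ch07_sparse_solver
enable_projects() {
  file=$1
  shift
  cp "$file" "$file.bak"   # keep a backup of the original file
  for project in "$@"; do
    # Strip the leading '#' from the matching add_subdirectory line
    sed "s/^#[[:space:]]*\(add_subdirectory($project)\)/\1/" "$file" > "$file.tmp" &&
      mv "$file.tmp" "$file"
  done
}
```

Several projects can be passed at once, for example enable_projects CMakeLists.txt ch10_texture_squares ch10_three_squares. After changing CMakeLists.txt, rerun cmake .. and cmake --build . from the build folder so the newly enabled projects are configured and built.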

In addition to installing OpenGL, GLUT, and GLEW, many Linux systems need to be explicitly directed to use the NVIDIA device for graphics. On my system, I set the __NV_PRIME_RENDER_OFFLOAD environment variable to 1 and the __GLX_VENDOR_LIBRARY_NAME environment variable to nvidia. The second variable tells OpenGL to rely on NVIDIA's OpenGL driver for graphical rendering.
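In a POSIX-style shell, the two settings described above can be applied to the current session as follows. Whether they are needed at all depends on the system, so treat this as a sketch of one working configuration, not a requirement.

```shell
# Direct OpenGL rendering to the NVIDIA device for this shell session
export __NV_PRIME_RENDER_OFFLOAD=1
export __GLX_VENDOR_LIBRARY_NAME=nvidia
```

The variables last only for the current session. To make them permanent, the same two lines can be added to a shell startup file such as ~/.bashrc, or they can be placed in front of a single command to affect only that run.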
