Example code for CUDA 13: High-Performance Computing and Graphics

This repository contains the example projects for the book CUDA 13: High-Performance Computing and Graphics by Matt Scarpino.

The example projects in the chapters are:

ch02_intro - Simple application to verify installation of the CUDA toolkit
ch03_add_arrays - Adds arrays of values together, demonstrates memory operations
ch03_check_properties - Reads processing capabilities of the graphics card
ch04_half_floats - Demonstrates usage of half-precision floating-point values
ch04_memory_spaces - Shows how different memory spaces can be accessed in CUDA
ch04_print_builtin - Displays values of CUDA's built-in variables
ch04_shared_sync - Demonstrates usage of shared memory and thread block synchronization
ch05_mapped_memory - Accelerates data access by taking advantage of mapped memory
ch05_stream_events - Shows how CUDA streams and events work together
ch05_tree_recursion - Uses dynamic parallelism to implement recursion
ch05_wmma_demo - Performs matrix multiplication and accumulation using the WMMA API
ch06_complex_dot - Performs a complex-valued dot product using cuBLAS
ch06_matrix_vec - Uses cuBLAS to multiply a matrix by a vector
ch07_dense_solver - Demonstrates how cuSOLVER can be used to solve a dense matrix system
ch07_sparse_solver - Shows how cuDSS can be used to solve a sparse matrix system
ch08_fft_ifft - Performs the forward FFT and inverse FFT using the cuFFT package
ch09_image_contrast - Updates an image's contrast using the NPPI package
ch09_image_filter - Performs image filtering using NPPI
ch09_rotate_resize - Rotates and resizes an image using NPPI
ch10_texture_squares - Applies texture to an OpenGL rendering
ch10_three_squares - Creates a simple OpenGL rendering
ch11_basic_interop - Demonstrates how CUDA and OpenGL can work together
ch11_spinning_sphere - Animates rendering using CUDA and OpenGL
ch12_gaussian_blur - Shows how CUDA and OpenGL can work together to filter textures

The example projects in the appendices are:

appa_add_arrays - Uses the CUDA Driver API to add two arrays
appa_device_attributes - Accesses device attributes using the CUDA Driver API
appa_runtime_compile - Demonstrates how NVRTC makes runtime compilation possible
appb_array_reverse - Reverses an array of values using Parallel Thread Execution (PTX)
appb_simple_ptx - Presents a simple application based on Parallel Thread Execution (PTX)

The repository contains a file named CMakeLists.txt and a folder for each project. If you've installed CMake and the CUDA Toolkit, the build process consists of five steps:

Open a terminal and change to the top-level directory
Create a build folder with mkdir build
Enter the build folder with cd build
Generate the build files with cmake ..
Build each project with cmake --build .
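
The five steps above can be sketched as a small shell function. The name build_examples is a hypothetical helper and not part of the repository, and mkdir -p is used in place of mkdir so the function can be rerun when the build folder already exists.

```shell
# Hypothetical helper (not part of the repository): configure and build
# every project enabled in the top-level CMakeLists.txt.
# The body runs in a subshell, so the caller's working directory is unchanged.
build_examples() (
    # Step 1: change to the top-level directory (defaults to the current one)
    cd "${1:-.}" &&
    # Step 2: create the build folder (-p: no error if it already exists)
    mkdir -p build &&
    # Step 3: enter the build folder
    cd build &&
    # Step 4: generate the build files
    cmake .. &&
    # Step 5: build each project
    cmake --build .
)
```

Calling build_examples with the path to the repository's top-level directory runs all five steps and stops at the first one that fails.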

The last step tells CMake to enter each subdirectory listed in CMakeLists.txt and build the source code into an executable. When this finishes, the build folder will contain a subfolder for each project. Each subfolder will contain the compiled executable that can be run from the command line.

An example will help make this clear. Once the ch03_add_arrays project is built, the build folder will contain a subfolder named ch03_add_arrays. If the build succeeded, the subfolder will contain an executable named ch03_add_arrays.
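
As a rough sketch, a project's executable can be checked for and launched from inside the build folder as follows. The name run_example is a hypothetical helper, and the layout it assumes (a subfolder and an executable that share the project's name) is the one described above. Some CMake generators may place the executable elsewhere or add a file extension.

```shell
# Hypothetical helper: run a built project from inside the build folder.
# Assumes the executable is at <project>/<project>, as described above.
run_example() {
    exe="./$1/$1"
    if [ -x "$exe" ]; then
        "$exe"
    else
        echo "No executable at $exe (was the project built?)" >&2
        return 1
    fi
}
```

For example, run_example ch03_add_arrays launches the array-addition project if its build succeeded.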

The last step of the process builds most of the book's projects, but not all of them. The ch07_sparse_solver project requires cuDSS to be installed and the projects in Chapters 10, 11, and 12 require OpenGL, GLUT, and GLEW. Chapter 7 explains how to install cuDSS and Chapter 10 explains how to install OpenGL, GLUT, and GLEW.

Once these additional packages have been installed, the projects can be added to the build by uncommenting lines in the top-level CMakeLists.txt. For example, once cuDSS is installed, the ch07_sparse_solver project can be added to the build by removing the pound sign (#) in the following line:

#add_subdirectory(ch07_sparse_solver)

Similarly, when OpenGL, GLUT, and GLEW are available, the add_subdirectory lines for the ch10_texture_squares, ch10_three_squares, ch11_basic_interop, ch11_spinning_sphere, and ch12_gaussian_blur projects can be uncommented.
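
The lines can be uncommented in any text editor. For those who prefer the command line, here is a minimal sketch that uses sed. The name enable_project is a hypothetical helper, and it assumes each disabled line starts with a pound sign followed by add_subdirectory(project_name), as in the line shown above. A backup of the original file is saved with a .bak extension.

```shell
# Hypothetical helper: uncomment one add_subdirectory line.
# Usage: enable_project <project_name> [path to CMakeLists.txt]
enable_project() {
    project="$1"
    cmake_file="${2:-CMakeLists.txt}"
    # Remove the leading '#' (and any spaces after it) from the matching line.
    # -i.bak edits the file in place and keeps a backup copy.
    sed -i.bak "s/^#[[:space:]]*add_subdirectory($project)/add_subdirectory($project)/" "$cmake_file"
}
```

Running enable_project ch07_sparse_solver from the top-level directory enables the cuDSS project. The same call can be repeated for each of the OpenGL projects, after which the build steps can be run again.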

In addition to installing OpenGL, GLUT, and GLEW, many Linux systems need to be specifically directed to use the Nvidia device for graphics. On my system, I set the __NV_PRIME_RENDER_OFFLOAD environment variable to 1, which offloads rendering to the Nvidia GPU. I also set the __GLX_VENDOR_LIBRARY_NAME environment variable to nvidia, which tells OpenGL to rely on Nvidia's OpenGL driver for graphical rendering.
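
In a POSIX-style shell, the two settings described above might look like the following. This is only a sketch of one way to set them. Whether they are needed at all depends on the system.

```shell
# Offload rendering to the Nvidia GPU
export __NV_PRIME_RENDER_OFFLOAD=1
# Use Nvidia's OpenGL driver for graphical rendering
export __GLX_VENDOR_LIBRARY_NAME=nvidia

# Alternatively, set both variables for a single command only, for example
# (run from the build folder, assuming ch10_three_squares has been built):
#   __NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia ./ch10_three_squares/ch10_three_squares
```

The exported values last only for the current terminal session. Adding the two export lines to a shell startup file would make them persistent.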
