@@ -7,6 +7,8 @@ Terry Sun; Arch Linux, Intel i5-4670, GTX 750
77
88This project contains a simplified graphics pipeline implemented in CUDA.
99
10+ ![](renders/demo.gif)
11+
1012## Pipeline Overview
1113
1214 1. Vertex shader: applies a model-view-projection transformation to each
@@ -16,51 +18,103 @@ This project contains a simplified graphics pipeline implemented in CUDA.
1618 primitives (triangles). Parallelized across primitives.
1719
1820 3. Geometry shader: after primitives are assembled, the geometry shader
19- generates more (or fewer) triangles for each existing triangle, up to a fixed
20- N for each triangle. Examples of this are (fixed-number) tesselation and
21- backface culling (both implemented).
21+ performs additional primitive generation (or deletion), up to a fixed factor
22+ (4) per original primitive.
23+
24+ 1. Backface culling: triangles that face away from the camera are removed.
25+ `thrust::remove_if` stream compaction is used to filter these out before
26+ rasterization occurs.
27+
28+ 2. Tessellation shading/smoothing: each triangle is subdivided into 4 smaller
29+ triangles, with interpolated normals.
2230
23- 3 . Rasterization: uses a scanline algorithm to determine which fragments are
31+ 4. Rasterization: uses a scanline algorithm to determine which fragments are
2432 covered by a particular primitive, performs depth testing, and stores into a
2533 depth buffer. Uses an axis-aligned bounding box for optimization, barycentric
2634 coordinate checking to test coverage, and CUDA `atomicMin` to avoid race
2735 conditions when doing depth testing. Parallelized across primitives.
2836
29- 4 . Fragment shading: computes color of each pixel using Lambert (diffuse)
37+ 5. Fragment shading: computes color of each pixel using Lambert (diffuse)
3038 shading. Interpolates normals within a triangle. Parallelized across
3139 fragments.
3240
41+ 6. Copy to screen/frame buffer.
42+
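The rasterization stage listed above can be sketched on the host as follows. This is a minimal illustration, not the project's kernel: the `Vec3` struct, the `kDepthScale` fixed-point factor, the pixel-center sampling, and all function names are assumptions made for the example, and a `std::atomic` compare-exchange loop stands in for CUDA `atomicMin`.

```cpp
#include <algorithm>
#include <atomic>
#include <climits>
#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };

// Hypothetical fixed-point scale for storing float depth in an integer buffer.
constexpr float kDepthScale = 1000000.0f;

// Twice the signed area of triangle (a, b, p) in 2D; z is ignored.
static float edgeFunction(const Vec3& a, const Vec3& b, float px, float py) {
    return (b.x - a.x) * (py - a.y) - (b.y - a.y) * (px - a.x);
}

// Host-side stand-in for CUDA atomicMin: lowers `slot` to `value` if smaller.
static void atomicMinInt(std::atomic<int>& slot, int value) {
    int current = slot.load();
    while (value < current && !slot.compare_exchange_weak(current, value)) {
        // compare_exchange_weak refreshes `current` on failure; retry.
    }
}

// Resets every depth entry to "infinitely far".
void clearDepth(std::vector<std::atomic<int>>& depth) {
    for (auto& d : depth) d.store(INT_MAX);
}

// Rasterizes one triangle given in pixel coordinates.
// Returns the number of pixels covered by the triangle.
int rasterizeTriangle(const Vec3 tri[3], int width, int height,
                      std::vector<std::atomic<int>>& depth) {
    // Axis-aligned bounding box, clamped to the screen.
    int xmin = std::max(0, static_cast<int>(std::floor(std::min({tri[0].x, tri[1].x, tri[2].x}))));
    int xmax = std::min(width - 1, static_cast<int>(std::ceil(std::max({tri[0].x, tri[1].x, tri[2].x}))));
    int ymin = std::max(0, static_cast<int>(std::floor(std::min({tri[0].y, tri[1].y, tri[2].y}))));
    int ymax = std::min(height - 1, static_cast<int>(std::ceil(std::max({tri[0].y, tri[1].y, tri[2].y}))));

    float area = edgeFunction(tri[0], tri[1], tri[2].x, tri[2].y);
    if (area == 0.0f) return 0;  // degenerate triangle

    int covered = 0;
    for (int y = ymin; y <= ymax; ++y) {
        for (int x = xmin; x <= xmax; ++x) {
            float px = x + 0.5f, py = y + 0.5f;  // sample at the pixel center
            // Barycentric coordinates; dividing by area handles either winding.
            float w0 = edgeFunction(tri[1], tri[2], px, py) / area;
            float w1 = edgeFunction(tri[2], tri[0], px, py) / area;
            float w2 = edgeFunction(tri[0], tri[1], px, py) / area;
            if (w0 < 0.0f || w1 < 0.0f || w2 < 0.0f) continue;  // outside

            float z = w0 * tri[0].z + w1 * tri[1].z + w2 * tri[2].z;
            atomicMinInt(depth[y * width + x], static_cast<int>(z * kDepthScale));
            ++covered;
        }
    }
    return covered;
}
```

In a GPU version, one thread would run this loop per primitive, which is why the depth write has to be atomic.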
3343## Features
3444
35- ### Geometry shader + backface culling
45+ ### Geometry shader
46+
47+ **Backface culling**. Triangles that do not face the camera are removed before
48+ rasterization. Each triangle is tested by computing the cross product of two of its
49+ edges (v0-v1, v0-v2); by convention, triangles that face away from the front of
50+ the model are defined such that this cross product has a negative z
51+ component. ([source][bfc-wiki])
52+
53+ [bfc-wiki]: https://en.wikipedia.org/wiki/Back-face_culling
54+
55+ Back-facing triangles are stream compacted away. This improves warp
56+ coherency during rasterization, because threads in the scanline function no
57+ longer sit idle on triangles that would never be drawn.
58+
59+ ![](renders/backface.gif)
60+
61+ (In this example the back direction is fixed relative to the model in order to
62+ demonstrate missing faces. In practice, backface culling would be invisible to
63+ the viewer.)
64+
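The backface test and compaction described above can be sketched on the host like this. It is only an illustration under stated assumptions: `Vec3`, `Triangle`, and the function names are hypothetical, and `std::remove_if` stands in for `thrust::remove_if`, which follows the same predicate-based pattern on the device.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };
struct Triangle { Vec3 v[3]; };

// z component of the cross product (v0 - v1) x (v0 - v2).
float crossZ(const Triangle& t) {
    float ax = t.v[0].x - t.v[1].x, ay = t.v[0].y - t.v[1].y;
    float bx = t.v[0].x - t.v[2].x, by = t.v[0].y - t.v[2].y;
    return ax * by - ay * bx;
}

// A negative z component marks the triangle as back-facing.
bool isBackFacing(const Triangle& t) {
    return crossZ(t) < 0.0f;
}

// Compacts the list in place and returns the number of surviving triangles.
std::size_t cullBackfaces(std::vector<Triangle>& tris) {
    auto newEnd = std::remove_if(tris.begin(), tris.end(), isBackFacing);
    tris.erase(newEnd, tris.end());
    return tris.size();
}
```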
65+ *Tessellation geometry shading*. A second geometry shader divides each triangle
66+ into 4 smaller triangles (see left, below). Three new vertices are generated
67+ from each existing triangle. The vertex transformation must be applied again to
68+ each of these vertices, which blurs the boundary between pipeline stages. (I considered moving
69+ the entire vertex shader into the geometry shader, but kept the vertex shader
70+ separate as an optimization, so that the original vertices are transformed only once.)
71+
72+ Below: the middle triangle of the 4 generated triangles is colored lightly to
73+ show the pattern of tessellation.
74+
75+ ![](renders/tri-subdiv.png)
76+ ![](renders/suzanne-subdiv.png)
3677
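One way to produce the four-triangle pattern described above is to split each edge at its midpoint. The sketch below assumes midpoint placement and renormalized averaged normals; the README only states that three new vertices are generated and that normals are interpolated, so the exact placement, the `Vertex` layout, and the function names are assumptions.

```cpp
#include <array>
#include <cmath>

struct Vec3 { float x, y, z; };
struct Vertex { Vec3 pos; Vec3 nor; };
struct Triangle { Vertex v[3]; };

static Vec3 midpoint(const Vec3& a, const Vec3& b) {
    return {(a.x + b.x) * 0.5f, (a.y + b.y) * 0.5f, (a.z + b.z) * 0.5f};
}

static Vec3 normalize(const Vec3& v) {
    float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    if (len == 0.0f) return v;  // leave a zero vector untouched
    return {v.x / len, v.y / len, v.z / len};
}

// New vertex halfway along an edge, with an interpolated unit normal.
static Vertex midVertex(const Vertex& a, const Vertex& b) {
    return {midpoint(a.pos, b.pos), normalize(midpoint(a.nor, b.nor))};
}

// Splits one triangle into three corner triangles plus the middle one,
// preserving the winding order of the input.
std::array<Triangle, 4> subdivide(const Triangle& t) {
    Vertex m01 = midVertex(t.v[0], t.v[1]);
    Vertex m12 = midVertex(t.v[1], t.v[2]);
    Vertex m20 = midVertex(t.v[2], t.v[0]);

    std::array<Triangle, 4> out;
    out[0] = Triangle{{t.v[0], m01, m20}};
    out[1] = Triangle{{m01, t.v[1], m12}};
    out[2] = Triangle{{m20, m12, t.v[2]}};
    out[3] = Triangle{{m01, m12, m20}};  // the middle triangle
    return out;
}
```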
3778### Color interpolation
3879
3980For every point, its normal is interpolated from its relative distance from the
4081 3 vertices of its triangle. This is calculated using barycentric coordinates.
41- This normal is then used to calculate a Lambert (diffuse) shading, which is
42- smooth.
82+ This normal is then used to calculate Lambert (diffuse) shading, plus a small
83+ amount of ambient lighting.
84+
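The interpolation and shading steps above can be sketched as follows. The ambient weight and the way it is blended with the diffuse term are assumptions for illustration (the README only says "a small amount"), as are the `Vec3` helpers and function names.

```cpp
#include <algorithm>
#include <cmath>

struct Vec3 { float x, y, z; };

static float dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

static Vec3 normalize(const Vec3& v) {
    float len = std::sqrt(dot(v, v));
    if (len == 0.0f) return v;  // leave a zero vector untouched
    return {v.x / len, v.y / len, v.z / len};
}

// Blends the three vertex normals using the fragment's barycentric
// coordinates (bary.x, bary.y, bary.z weight vertices 0, 1, 2).
Vec3 interpolateNormal(const Vec3 n[3], const Vec3& bary) {
    Vec3 blended = {
        n[0].x * bary.x + n[1].x * bary.y + n[2].x * bary.z,
        n[0].y * bary.x + n[1].y * bary.y + n[2].y * bary.z,
        n[0].z * bary.x + n[1].z * bary.y + n[2].z * bary.z,
    };
    return normalize(blended);
}

// Lambert intensity in [0, 1]: clamped N.L scaled into the range left
// above the ambient floor. `ambient` is a hypothetical tuning parameter.
float lambert(const Vec3& normal, const Vec3& toLight, float ambient) {
    float diffuse = std::max(0.0f, dot(normalize(normal), normalize(toLight)));
    return ambient + (1.0f - ambient) * diffuse;
}
```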
85+ Comparison of non-interpolated and interpolated normals:
86+
87+ ![ ] ( renders/suzanne-nosmooth.png )
88+ ![ ] ( renders/suzanne-smooth.png )
89+
90+ Animation of a light moving across the screen:
91+
92+ ![ ] ( renders/light.gif )
4393
4494### Antialiasing
4595
46- TODO: image
96+ Four fragments are generated for every pixel, spaced evenly within the pixel.
97+ The parallelization of this process varies between stages. In some stages (the fragment
98+ shader), a thread is launched for each fragment; in others (scanline),
99+ a single thread handles four fragments in succession. Future work could
100+ analyze other ways of parallelizing this multi-fragment approach.
47101
48- Four fragments are generated for every pixel, spaced evenly within the pixel. In
49- general this is parallelized
102+ At the very end of the pipeline, as fragments are translated into colors for the
103+ frame buffer, the four fragments associated with each pixel are averaged together.
50104
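The final averaging step might look like the sketch below. The 2x2 sample offsets and the buffer layout (the four fragments of pixel `i` stored contiguously at indices `4*i` to `4*i + 3`) are assumptions for the example, as are the names `Color`, `kSampleOffsets`, and `resolveToFramebuffer`.

```cpp
#include <cstddef>
#include <vector>

struct Color { float r, g, b; };

constexpr int kSamplesPerPixel = 4;

// Assumed sub-pixel sample positions: an evenly spaced 2x2 grid.
constexpr float kSampleOffsets[kSamplesPerPixel][2] = {
    {0.25f, 0.25f}, {0.75f, 0.25f}, {0.25f, 0.75f}, {0.75f, 0.75f},
};

// Averages each pixel's four fragments into one frame buffer color.
std::vector<Color> resolveToFramebuffer(const std::vector<Color>& fragments) {
    std::vector<Color> frame(fragments.size() / kSamplesPerPixel);
    for (std::size_t pixel = 0; pixel < frame.size(); ++pixel) {
        Color sum = {0.0f, 0.0f, 0.0f};
        for (int s = 0; s < kSamplesPerPixel; ++s) {
            const Color& f = fragments[pixel * kSamplesPerPixel + s];
            sum.r += f.r;
            sum.g += f.g;
            sum.b += f.b;
        }
        frame[pixel] = {sum.r * 0.25f, sum.g * 0.25f, sum.b * 0.25f};
    }
    return frame;
}
```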
51105### Scissor test
52106
53107Clipping optimization. Define a `glm::vec4(xmin, xmax, ymin, ymax)` window in
54108which to render to the screen. During scanline rasterization,
55109this test discards data outside of this window.
56110
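One plausible way to apply the scissor window during rasterization is to intersect each primitive's bounding box with it before scanning. The sketch below assumes inclusive integer pixel bounds and uses a plain struct in place of the `glm::vec4`; `Rect` and `clipToScissor` are hypothetical names, and the project may instead test each fragment individually.

```cpp
#include <algorithm>

// Inclusive pixel bounds, in the same (xmin, xmax, ymin, ymax) order
// as the scissor window described above.
struct Rect { int xmin, xmax, ymin, ymax; };

// Shrinks `box` to its overlap with `scissor`.
// Returns false when nothing is left to rasterize.
bool clipToScissor(Rect& box, const Rect& scissor) {
    box.xmin = std::max(box.xmin, scissor.xmin);
    box.xmax = std::min(box.xmax, scissor.xmax);
    box.ymin = std::max(box.ymin, scissor.ymin);
    box.ymax = std::min(box.ymax, scissor.ymax);
    return box.xmin <= box.xmax && box.ymin <= box.ymax;
}
```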
57- ## Internals
111+ ![](renders/suzanne-clipped.png)
58112
59- ![ ] ( renders/suzanne-red.png )
113+ ## Internals
60114
61115![](renders/suzanne-normals.png)
62116
63- # Bloopers
117+ ## Bloopers
64118
65119![](renders/cow-oops1.png)
66120