Visibility Driven Out-of-Core HLOD Rendering

Patrick Cozzi
The University of Pennsylvania
00000000 of 01010110

Project History
Procedurally generated model of Pompeii: ~1.4 billion polygons.
Image from [Mueller06]
00000001 of 01010110

Project History
Boeing 777 model: ~350 million polygons.
Image from http://graphics.cs.uni-sb.de/MassiveRT/boeing777.html
00000010 of 01010110

Contents
 Previous Work
 View Frustum and Occlusion Culling
 Hardware Occlusion Queries (HOQ)
 Level of Detail (LOD)
 Hierarchical Level of Detail (HLOD)
 Out-of-Core Rendering (OOC)
00000011 of 01010110

Contents Continued
 Implementation Work
 Vertex Clustering [Rossignac93]
 HLOD Tree Creation
 Primary Contribution: OOC Rendering
 Results
 Future Work
 Demos throughout
00000100 of 01010110

View Frustum Culling
 Can be slower than brute force. When?
[Figure: objects labeled "culled" or "rendered" relative to the view frustum]
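A minimal sketch of the per-object test behind view frustum culling, assuming axis-aligned bounding boxes and planes whose normals point into the frustum. The `Vec3`, `Plane`, `AABB`, and `isCulled` names are hypothetical and do not come from the slides. The test is conservative: a box is culled only when it lies completely behind one plane.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical minimal types for illustration only.
struct Vec3 { double x, y, z; };

// Plane in the form dot(n, p) + d = 0, with n pointing into the frustum.
struct Plane { Vec3 n; double d; };

struct AABB { Vec3 min, max; };

// Returns true when the box lies entirely on the outside of at least one
// plane. Boxes that straddle a plane are kept (not culled).
template <std::size_t N>
bool isCulled(const AABB& box, const std::array<Plane, N>& planes)
{
    assert(box.min.x <= box.max.x && box.min.y <= box.max.y && box.min.z <= box.max.z);
    for (const Plane& p : planes)
    {
        // Box corner that is farthest along the plane normal.
        const Vec3 v{ p.n.x >= 0 ? box.max.x : box.min.x,
                      p.n.y >= 0 ? box.max.y : box.min.y,
                      p.n.z >= 0 ? box.max.z : box.min.z };
        // If even that corner is behind the plane, the whole box is outside.
        if (p.n.x * v.x + p.n.y * v.y + p.n.z * v.z + p.d < 0)
            return true;
    }
    return false;
}
```

Running this test on every object costs CPU time even when nothing is culled, which is one situation where culling can lose to brute force.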
00000101 of 01010110

 Demo
00001000 of 01010110

Occlusion Culling
 Effective in scenes with high depth
complexity
culled
00001001 of 01010110

Occlusion Culling
 From-region or from-point
 Most are conservative
 Occluder Fusion
 Difficult for general scenes with
arbitrary occluders. So make
simplifying assumptions:
 [Wonka00] – urban environments
 [Ohlarik08] – planets and satellites
00001010 of 01010110

Hardware Occlusion Queries
 From-point visibility that handles
general scenes with arbitrary
occluders and occluder fusion
 How?
 Use the GPU
00001011 of 01010110

 Disable color and depth writes
Color Buffer Depth Buffer
00001100 of 01010110

 Render BV using HOQ
00001101 of 01010110

 Enable color and depth writes
Color Buffer Depth Buffer
00001110 of 01010110

 Enable color and depth writes
 Render object based on HOQ
results
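The four steps above can be emulated on the CPU to show what a query counts. This is only a sketch under stated assumptions: it is not GPU code, and `DepthBuffer`, `Rect`, `rasterize`, and `renderIfVisible` are hypothetical names. Bounding volumes and objects are reduced to screen-space rectangles at a constant depth.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CPU stand-in for a depth buffer; smaller z is closer to the viewer.
struct DepthBuffer
{
    int width, height;
    std::vector<float> z;
    DepthBuffer(int w, int h)
        : width(w), height(h), z(static_cast<std::size_t>(w) * h, 1.0f) {}
};

// Screen-space rectangle [x0, x1) x [y0, y1) at constant depth.
struct Rect { int x0, y0, x1, y1; float depth; };

// Rasterizes the rectangle and returns how many samples pass the depth test.
// With writeDepth == false this mimics a query issued with writes disabled.
unsigned int rasterize(DepthBuffer& buf, const Rect& r, bool writeDepth)
{
    assert(r.x0 >= 0 && r.y0 >= 0 && r.x1 <= buf.width && r.y1 <= buf.height);
    unsigned int passed = 0;
    for (int y = r.y0; y < r.y1; ++y)
    {
        for (int x = r.x0; x < r.x1; ++x)
        {
            float& stored = buf.z[static_cast<std::size_t>(y) * buf.width + x];
            if (r.depth < stored)
            {
                ++passed;
                if (writeDepth)
                    stored = r.depth;
            }
        }
    }
    return passed;
}

// Test the bounding volume with writes disabled, then render the object
// with writes enabled only if at least one sample passed.
bool renderIfVisible(DepthBuffer& buf, const Rect& boundingVolume, const Rect& object)
{
    if (rasterize(buf, boundingVolume, false) == 0)
        return false;   // fully occluded: skip the object
    rasterize(buf, object, true);
    return true;
}
```

Because the bounding volume is tested against whatever is already in the depth buffer, occluder fusion comes for free: any combination of earlier objects can hide it.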
00001111 of 01010110

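class IQueryOcclusion
{
public:
    virtual ~IQueryOcclusion() {}

    // Bracket the draw calls whose fragments should be counted.
    virtual void Begin() = 0;
    virtual void End() = 0;

    // True once the result can be read without waiting on the GPU.
    virtual bool IsResultAvailable() = 0;

    virtual unsigned int NumberOfSamplesPassed() = 0;
    virtual unsigned int NumberOfFragmentsPassed() = 0;
};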
00010000 of 01010110

 CPU stalls and GPU starvation
Without queries:
CPU: Draw o1, Draw o2, Draw o3
GPU: Draw o1, Draw o2, Draw o3
With a query on o1:
CPU: Query o1, -- stall --, Draw o1
GPU: Query o1, -- starve --, Draw o1
00010001 of 01010110

Is Culling Enough?
00010010 of 01010110

Is Culling Enough?
Now what?
00010011 of 01010110

Is Culling Enough?
 Demo
00010100 of 01010110

Level of Detail
 Generation: fewer triangles, simpler shader
 Selection: distance, pixel size
 Switching: avoid popping
 Discrete, Continuous, Hierarchical
00010101 of 01010110

Discrete LOD
3,086 Triangles 52,375 Triangles 69,541 Triangles
00010110 of 01010110

Discrete LOD
 Demo
00010111 of 01010110

Discrete LOD
Not enough detail
up close
Too much detail
in the distance
00011000 of 01010110

Continuous LOD
edge collapse
vertex split
Image from [Luebke01]
00011001 of 01010110

Hierarchical LOD
1 Node
3,086 Triangles
4 Nodes
9,421 Triangles
16 Nodes
77,097 Triangles
00011010 of 01010110

Hierarchical LOD
visit(node)
{
    if (computeSSE(node) < pixel tolerance)
    {
        render(node);
    }
    else
    {
        foreach (child in node.children)
            visit(child);
    }
}
Node
Refinement
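The refinement loop above can be sketched in C++ as follows. The `Node` layout is hypothetical, and the slides do not give a formula for `computeSSE`; this sketch assumes the common perspective projection of a node's object-space geometric error into pixels. It also renders leaf nodes even when their error exceeds the tolerance, since there is nothing left to refine.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical node layout for illustration only.
struct Node
{
    double geometricError;   // object-space error of this node's simplified mesh
    double distance;         // distance from the viewer to the node
    std::vector<Node> children;
};

// Assumed formula: projects the geometric error to pixels for a perspective
// view. viewportHeight is in pixels, fovY is in radians.
double computeSSE(const Node& node, double viewportHeight, double fovY)
{
    assert(node.distance > 0.0);
    return (node.geometricError * viewportHeight) /
           (2.0 * node.distance * std::tan(fovY / 2.0));
}

// Collects the nodes that would be rendered for the given pixel tolerance.
void visit(const Node& node, double viewportHeight, double fovY,
           double pixelTolerance, std::vector<const Node*>& toRender)
{
    if (computeSSE(node, viewportHeight, fovY) < pixelTolerance ||
        node.children.empty())
    {
        toRender.push_back(&node);
    }
    else
    {
        for (const Node& child : node.children)
            visit(child, viewportHeight, fovY, pixelTolerance, toRender);
    }
}
```

A larger tolerance stops the traversal higher in the tree, trading detail for fewer triangles.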
00011100 of 01010110

Hierarchical LOD
00011101 of 01010110

Hierarchical LOD
 New Problem: Cracks
00011110 of 01010110

Hierarchical LOD
 Demo
00011111 of 01010110

HLOD + Culling
visit(node)
{
    if (node overlaps view frustum)
    {
        // ...
    }
}
00100000 of 01010110

HLOD + Culling
visit(node)
{
    if (node overlaps view frustum)
    {
        render node’s BV with HOQ
        if (query.NumberOfFragmentsPassed() > 0)
        {
            // ...
        }
    }
}
Render front to back!
00100001 of 01010110

HLOD + Culling + VMSSE
visit(node)
{
    if (node overlaps view frustum)
    {
        render node’s BV with HOQ
        if (query.NumberOfFragmentsPassed() > 0)
        {
            if (computeVMSSE(node, query) < tolerance)
            {
                render(node);
            }
            else
            {
                // ...
            }
        }
    }
}
00100010 of 01010110

 VMSSE: Virtual Multiresolution SSE
 Relative Visibility = # pixels visible / # possible pixels visible
 VMSSE = f(SSE, Relative Visibility)
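A small sketch of how these two quantities could be computed from a query result. The slides leave `f` unspecified, so the linear bias used here, including the `minBias` parameter, is purely an assumption for illustration and is not the function from [Charalambos07].

```cpp
#include <algorithm>
#include <cassert>

// Fraction of the bounding volume's projected pixels that passed the query.
double relativeVisibility(unsigned int visiblePixels, unsigned int possiblePixels)
{
    assert(possiblePixels > 0 && visiblePixels <= possiblePixels);
    return static_cast<double>(visiblePixels) / possiblePixels;
}

// Assumed form of f: scale SSE by a bias that grows linearly from minBias
// (barely visible) to 1 (fully visible). Hypothetical, not from the slides.
double computeVMSSE(double sse, double visibility, double minBias = 0.25)
{
    const double v = std::max(0.0, std::min(1.0, visibility));
    const double bias = minBias + (1.0 - minBias) * v;
    return sse * bias;
}
```

With any monotonic choice of `f`, a mostly occluded node reports a smaller error than its SSE alone, so refinement stops earlier and fewer triangles are drawn for it.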
00100011 of 01010110

Optimized HLOD Refinement Driven by
HOQs [Charalambos07]
 Exploit spatial and temporal
coherence for scheduling HOQs.
 Predict refinement based on node’s
relative visibility from previous
frame
 $\mathrm{VMSSE}_i^{\mathrm{est}} = \mathrm{SSE}_i \cdot \mathrm{bias}_{i-1}$
00100100 of 01010110

Optimized HLOD Refinement Driven by
HOQs [Charalambos07]
 Example prediction
 Refinement stopped for this node in
previous frame
 $\mathrm{VMSSE}_i^{\mathrm{est}}$ < threshold ? Stop : Refine
 Stop:
 Issue query
 Render without checking query
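The prediction step can be sketched as below, assuming each node stores the bias derived from its relative visibility in the previous frame. `NodeHistory`, `Prediction`, and `predict` are hypothetical names, and how the bias is derived is not shown on the slides.

```cpp
#include <cassert>

enum class Prediction { Stop, Refine };

// Hypothetical per-node state carried from frame i-1 to frame i.
struct NodeHistory
{
    double previousBias;   // bias from last frame's relative visibility
};

// Estimates this frame's VMSSE from the current SSE and last frame's bias,
// so the decision does not have to wait for this frame's query result.
Prediction predict(double sse, const NodeHistory& history, double threshold)
{
    assert(sse >= 0.0 && history.previousBias >= 0.0);
    const double estimatedVMSSE = sse * history.previousBias;
    return estimatedVMSSE < threshold ? Prediction::Stop : Prediction::Refine;
}
```

On `Stop`, the query is still issued, but the node is rendered immediately; the result only feeds the next frame's bias, which avoids the CPU stall.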
00100101 of 01010110

Implementation Work
 3 HLOD algorithms including
[Charalambos07]
 Vertex Clustering
 HLOD Tree Creation
 OOC Rendering
 Load/Unload Rules
 Rendering
 Replacement Policy
 Multithreading
00100110 of 01010110

Vertex Clustering [Rossignac93]
 Fast: expected O(n)
 Robustness: arbitrary topology
 Capable of drastic simplification
 “Easy to code”
 OOC extensions [Lindstrom00]
00100111 of 01010110

1. Compute per-vertex weights
2. Assign vertices to clusters
3. Identify highest weighted vertex in each cluster
4. Collapse and remove degenerate triangles
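Steps 2 through 4 can be sketched with a uniform grid, as below. This is a minimal illustration, not the implementation from the project: the per-vertex weights of step 1 are assumed to be given as input, and `Vertex`, `Mesh`, and `clusterVertices` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <tuple>
#include <vector>

struct Vertex { double x, y, z; double weight; };
using Triangle = std::array<std::size_t, 3>;
struct Mesh { std::vector<Vertex> vertices; std::vector<Triangle> triangles; };

Mesh clusterVertices(const Mesh& in, double cellSize)
{
    assert(cellSize > 0.0);
    using Cell = std::tuple<long, long, long>;
    std::map<Cell, std::size_t> best;   // cell -> index of its highest weighted vertex
    std::vector<Cell> cellOf(in.vertices.size());

    // Steps 2 and 3: assign each vertex to a grid cell and track the
    // highest weighted vertex seen so far in that cell.
    for (std::size_t i = 0; i < in.vertices.size(); ++i)
    {
        const Vertex& v = in.vertices[i];
        cellOf[i] = Cell(static_cast<long>(std::floor(v.x / cellSize)),
                         static_cast<long>(std::floor(v.y / cellSize)),
                         static_cast<long>(std::floor(v.z / cellSize)));
        auto it = best.find(cellOf[i]);
        if (it == best.end())
            best[cellOf[i]] = i;
        else if (v.weight > in.vertices[it->second].weight)
            it->second = i;
    }

    // Emit one representative vertex per cluster.
    Mesh out;
    std::map<Cell, std::size_t> newIndex;
    for (const auto& entry : best)
    {
        newIndex[entry.first] = out.vertices.size();
        out.vertices.push_back(in.vertices[entry.second]);
    }

    // Step 4: remap triangles and drop those that collapsed to a line or point.
    for (const Triangle& t : in.triangles)
    {
        assert(t[0] < cellOf.size() && t[1] < cellOf.size() && t[2] < cellOf.size());
        const std::size_t a = newIndex[cellOf[t[0]]];
        const std::size_t b = newIndex[cellOf[t[1]]];
        const std::size_t c = newIndex[cellOf[t[2]]];
        if (a != b && b != c && a != c)
            out.triangles.push_back(Triangle{a, b, c});
    }
    return out;
}
```

A larger `cellSize` merges more vertices per cluster, which is how the method reaches drastic simplification; it also explains why the output triangle count is hard to control directly.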
00101000 of 01010110

3,086 Triangles 52,375 Triangles 69,541 Triangles
00101001 of 01010110

 Questionable Fidelity
 Hard to control output
 Conservative Error Metric
00101010 of 01010110

HLOD Tree Creation
 Input
 Model (.ply, .obj)
 Target triangles per leaf node
 Maximum tree depth
 Output
 1 file per node
 Normals computed at runtime
00101011 of 01010110

HLOD Tree Creation
 Top-down
 Root node:
Full AABB
Lowest Detail
00101100 of 01010110

HLOD Tree Creation
 Splitting Planes
2 Planes 3 Planes
00101101 of 01010110

HLOD Tree Creation
 Splitting Planes
00101110 of 01010110

HLOD Tree Creation
00101111 of 01010110

Previous Work: Out-of-Core
Based on [Ulrich02]
visit(node)
{
    if ((computeSSE(node) < pixel tolerance) ||   // to render
        (not all children resident))              // need all children
    {
        render(node);
        foreach (child in node.children)
            requestResidency(child);              // prefetch
    }
    else                                          // to refine
    {
        foreach (child in node.children)
            visit(child);
    }
}
00110000 of 01010110

 [Varadhan02]
 Requires full skeleton in memory
 No occlusion culling
 No front-to-back sorting
Image from [Varadhan02]
00110001 of 01010110

 [Corrêa03]
 PLP in separate thread
 Requires full skeleton in memory
 No LOD
00110010 of 01010110

Out-of-Core
 Replacement Policy?
 LRU?
 Can’t refine when one child is removed
 Remove deepest child in parent’s tree?
00110011 of 01010110

OOC Rendering
 Benefits of our algorithm
 No full HLOD skeleton
 Works with HOQs
 Refinement with a subset of children
 Replacement policy maximizes detail
near the viewer
 Multithreaded
00110100 of 01010110

OOC Rendering: Load/Unload
 HLOD tree on disk
00110101 of 01010110

 Subset of HLOD tree in memory
00110110 of 01010110

 Load node -> load children
skeletons
00110111 of 01010110

 Only unload dynamic leaves
00111000 of 01010110

 Nodes don’t need all their children
in memory
00111010 of 01010110

 Result:
 If a node is not a skeleton, none of its ancestors are skeletons. In other words, if a node has geometry loaded, so do all of its ancestors.
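The load and unload rules can be sketched as below to show why the invariant holds. The `Node` layout and function names are hypothetical. The sketch also assumes that a "dynamic leaf" is a node with geometry loaded whose children are all skeletons, which the slides imply but do not define.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical in-memory node. A skeleton has hasGeometry == false.
struct Node
{
    Node* parent = nullptr;
    bool hasGeometry = false;
    std::vector<std::unique_ptr<Node>> children;
};

// Load rule: loading a node's geometry also creates skeletons for its
// children. childrenOnDisk stands in for metadata read from the node's file.
void load(Node& node, int childrenOnDisk)
{
    // A node is only reachable once its parent has been loaded.
    assert(node.parent == nullptr || node.parent->hasGeometry);
    node.hasGeometry = true;
    while (static_cast<int>(node.children.size()) < childrenOnDisk)
    {
        auto child = std::make_unique<Node>();
        child->parent = &node;
        node.children.push_back(std::move(child));
    }
}

// Assumed definition: geometry loaded, and every child is a skeleton.
bool isDynamicLeaf(const Node& node)
{
    if (!node.hasGeometry)
        return false;
    for (const auto& child : node.children)
        if (child->hasGeometry)
            return false;
    return true;
}

// Unload rule: only dynamic leaves may be unloaded.
bool unload(Node& node)
{
    if (!isDynamicLeaf(node))
        return false;
    node.children.clear();      // drop the child skeletons
    node.hasGeometry = false;   // the node becomes a skeleton again
    return true;
}

// The invariant from the slide: every ancestor has geometry loaded.
bool ancestorsLoaded(const Node& node)
{
    for (const Node* p = node.parent; p != nullptr; p = p->parent)
        if (!p->hasGeometry)
            return false;
    return true;
}
```

Since a node with a loaded child is never a dynamic leaf, it can never be unloaded from under that child, so no loaded node ends up below a skeleton.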
00111011 of 01010110

 Never Happens:
00111100 of 01010110

OOC Rendering: Rendering
 Modify in-core HLOD
 Add request queue:
 Stop refinement at skeleton node
 Push node onto request queue
 Ensure parent safety
 Render subset of parent’s geometry
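A sketch of the modified traversal, assuming each node's screen-space error has already been computed for the frame. The `Node` and `FrameOutput` types are hypothetical. Rendering only a subset of the parent's geometry is reduced here to recording that the parent is drawn; clipping is not shown.

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Hypothetical node; sse is assumed to be precomputed for this frame.
struct Node
{
    double sse;
    bool hasGeometry;               // false for a skeleton node
    std::vector<Node*> children;
};

struct FrameOutput
{
    std::vector<const Node*> rendered;
    std::deque<Node*> requestQueue;   // would be consumed by a load thread
};

void visit(Node& node, double tolerance, FrameOutput& out)
{
    assert(node.hasGeometry);   // the traversal never descends into skeletons
    if (node.sse < tolerance || node.children.empty())
    {
        out.rendered.push_back(&node);
        return;
    }

    bool anyChildMissing = false;
    for (Node* child : node.children)
    {
        if (child->hasGeometry)
        {
            visit(*child, tolerance, out);
        }
        else
        {
            // Stop refinement at the skeleton and ask for its geometry.
            out.requestQueue.push_back(child);
            anyChildMissing = true;
        }
    }

    // Parent safety: the parent stands in for the children that are not
    // loaded yet. Restricting it to their regions is omitted in this sketch.
    if (anyChildMissing)
        out.rendered.push_back(&node);
}
```

Unlike the [Ulrich02]-style loop, refinement proceeds into whichever children are resident instead of waiting for all of them.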
00111101 of 01010110

OOC Rendering: Subset of Parent
 Use OpenGL clipping planes
00111110 of 01010110

 Without clipping planes
00111111 of 01010110

 Demo
01000000 of 01010110

OOC Rendering: Node Replacement
 Replacement List (only dynamic leaves)
01000001 of 01010110

 Replacement List Partitions
01000010 of 01010110

 Start Frame
01000011 of 01010110

 Add Node
01000100 of 01010110

 Render Node
01000101 of 01010110

 Move to safety
01000110 of 01010110

 Suggest Removal Node
01000111 of 01010110

OOC Rendering: Multithreading
01001000 of 01010110

Low Memory
 Demo
01001001 of 01010110

Selected Results
 Load Time
 10 Blocks in Pompeii
 5,646,041 triangles
              Time in seconds
Full model    5.2
Out-of-Core   0.05
01001010 of 01010110

Selected Results
View 1 View 2
 Zoomed out rendering
01001011 of 01010110

Selected Results
 Zoomed out rendering
                         View 1                          View 2
Brute Force              63 fps, 5,646,041 triangles     63 fps, 5,646,041 triangles
HLOD - SSE               1,415 fps, 161,742 triangles    881 fps, 302,337 triangles
HLOD - Naive VMSSE       1,060 fps, 140,458 triangles    300 fps, 260,007 triangles
HLOD - Scheduled VMSSE   1,176 fps, 140,458 triangles    588 fps, 270,774 triangles
01001100 of 01010110

Selected Results
 Zoomed In Rendering
View 3 View 4
01001101 of 01010110

Selected Results
View 3 View 4
01001110 of 01010110

Selected Results
                         View 3                          View 4
Brute Force              62 fps, 5,646,041 triangles     62 fps, 5,646,041 triangles
HLOD - SSE               128 fps, 2,541,434 triangles    98 fps, 3,222,701 triangles
HLOD - Naive VMSSE       180 fps, 346,901 triangles      320 fps, 46,765 triangles
HLOD - Scheduled VMSSE   210 fps, 601,730 triangles      232 fps, 103,844 triangles
01001111 of 01010110

Statistics
 Lines of Code
 GUI: 420
 Unit Tests: 1,720
 HLOD Creation: 4,600
 Rendering: 4,500
 Time Spent
 Coding: 8 weeks “full-time”: 3 last spring, 5 this fall.
 Plus reading, writing, slides, and logistics.
01010000 of 01010110

Future Work
 Improve tree creation
 Polygonal simplification
 Splitting planes
 Fill cracks
 Optimal disk layout
 Better occlusion performance
 Multiple volumes or occlusion-preserving low LOD
 Optimize use of clipping planes
01010001 of 01010110

Future Work
 Don’t require ancestors to have
geometry loaded.
 Much better use of memory
 More complicated rendering
 More rendering artifacts
01010010 of 01010110

Future Work
 Cache Management
 Aggressively remove nodes
 Replacement Policy: Average detail
instead of best up close
01010011 of 01010110

Future Work
 Multithreading
 Multiple load threads
 Fault tolerance, increase throughput
 Compute thread(s)
 Compute normals
 Decompress (/ recompress)
 Vertex cache optimize?
01010100 of 01010110

Future Work
 True Usefulness
 Textures
 Picking on individual objects
 Test with truly massive models
01010101 of 01010110

Future Work
 Today
 Mad Mex Happy Hour. Now – 6:30pm
 Saturday, February 7th
 Graduation Party. My House. 3pm.
01010110 of 01010110
