Video Game Optimization
Workshop
Amir H. Fassihi
Fanafzar Game Studio
Aug 2012
System Design Requirements
•  Functional
•  Non Functional
Non-Functional Requirements
•  Maintainability
•  Extensibility
•  Security
•  Scalability
•  Intellectual Manageability
•  Availability
•  Portability
•  Usability
•  Performance
Performance
The amount of work accomplished by a
computer system compared to the time and
resources used.
•  Short response time
•  High throughput
•  Low utilization of computer resources
•  High availability of applications
•  Fast data compression and decompression
•  High bandwidth/ Short data transmission time
Video Games
•  Most x-abilities are important
– Even more so for game engines. (As in
enterprise applications)
•  Performance is REALLY important!
– For any game or game engine.
System Design
•  Solution for Functional Requirements
•  Solution for Non-Functional Requirements
– Bulk of the technical efforts
– Conflicts in Design!
– Performance as the bad boy in the group
– Performance as the cream of the crop
– Performance being directly experienced by
end user
Can you make this?
Optimization
•  “The process of modifying a software
system to make some aspects of it work
more efficiently or use fewer resources.”
Optimization Lifecycle
1.  Benchmark
2.  Detect (Hotspots and Bottlenecks)
3.  Solve
4.  Check
5.  Goto 1
Levels of Optimization
•  System Level
•  Algorithmic Level
•  Micro Level
– Branch prediction
– Instruction throughput
– Latency
Project Lifecycle and Optimization
•  Pre-production
•  Production
•  Post-production
Optimization from High Level to Low Level
Quake Story: High level architectural
optimization before low level triangle draw
function (Carmack and Abrash)
http://www.bluesnews.com/abrash/
Measuring Performance in Games
1.  Set Specification
–  Performance Goal (FPS, time)
–  Hardware Specification
2.  Define Line Items
–  CPU time, RAM, GPU time, Video Mem
–  Rendering, Physics, Sound, Gameplay, Misc.
Memory Management (God of War)
[Diagram: 32 Meg memory. 16 Meg for Levels (split into 2), 4 * 1 Meg for Enemies, 1.5 Meg Exe, Run Time Data, Perm Data]
•  Establish Hard Rules.
–  16 Meg for Level Data (Split into 2 Levels)
–  4 * 1 Meg for Enemies
•  Maintain 60fps
From: Tim Moss 2006 GDC Talk
Tools
•  Profilers (Intel VTune, VS Profiler, …)
– Total time
– Self time
– Calls
•  System Monitors (Nvidia PerfHud, MS PIX,…)
•  System Adjusters (Intel GPA, …)
Holistic Optimization
•  Optimization Process
•  CPU Bound
•  GPU Bound
CPU Bound, Memory
•  Prefetching Memory
•  Memory Cache
Memory Optimization
•  Cache Miss
– Instruction Cache
– Data Cache
Memory Hierarchy
source: Memory Optimization, Christer Ericson, GDC 2003
Data Access Patterns
•  Linear Access Forward
for (i = 0; i < numData; ++i)
memArray[i];
•  Linear Access Backward
Data Access Patterns Ctd.
•  Periodic Access
struct vertex
{
float pos[3];
float norm[3];
float textCoord[3];
};
for (i = 0; i < num; ++i)
vertexArray[i].pos;
•  Random Access
AOS vs. SOA
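The two layouts can be sketched as follows. This is a minimal illustration, not code from the talk: the field names follow the `vertex` struct on the earlier slide, while `VertexAoS`, `VerticesSoA`, the two-component `texCoord`, and the two `sumX...` helpers are hypothetical names chosen for the example.

```cpp
#include <vector>

// Array of Structures (AoS): each vertex keeps all of its fields together.
struct VertexAoS {
    float pos[3];
    float norm[3];
    float texCoord[2];  // assumed two components for this sketch
};

// Structure of Arrays (SoA): each field lives in its own contiguous array.
struct VerticesSoA {
    std::vector<float> posX, posY, posZ;
    std::vector<float> normX, normY, normZ;
    std::vector<float> texU, texV;
};

// AoS traversal: reading only pos[0] steps over sizeof(VertexAoS) bytes
// per element, so most of each fetched cache line goes unused.
float sumXAoS(const std::vector<VertexAoS>& verts) {
    float sum = 0.0f;
    for (const VertexAoS& v : verts) sum += v.pos[0];
    return sum;
}

// SoA traversal: the same computation reads one dense float array.
float sumXSoA(const VerticesSoA& verts) {
    float sum = 0.0f;
    for (float x : verts.posX) sum += x;
    return sum;
}
```

Which layout wins depends on the access pattern: SoA tends to help when a loop touches one or two fields of many elements, while AoS can be fine when every field of an element is used together.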
Critical Stride
•  Stride size in memory read can cause
cache thrashing
Strip Mining
for
{ access pos;
}
for
{
access norm;
}
------------------------------------------------------
for
{
access pos;
access norm;
}
Memory
•  Stack
– Temporal coherence, spatial locality
•  Global
– No fragmentation, freed at end
•  Heap
– new, delete, malloc, free
– No spatial locality, no temporal coherence,
fragmentation
Load-Hit-Store
•  Write data to address x and then read the
data from address x -> Large stall
•  Writing data all the way to the main
memory through all caches -> 40 to 80
CPU cycle delay
•  http://assemblyrequired.crashworks.org/2008/07/08/load-hit-stores-and-the-__restrict-keyword/
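A small sketch of the pattern the linked article discusses, under stated assumptions: `accumulate` and `accumulateLocal` are hypothetical names, and `__restrict` is a compiler extension (accepted by GCC, Clang, and MSVC), not standard C++. Whether the first version actually stalls depends on the CPU and on what the optimizer can prove.

```cpp
// The compiler must assume *sum may alias values[i], so it can end up
// storing to *sum and reloading it on every iteration (write, then read
// of the same address).
void accumulate(const int* values, int count, int* sum) {
    for (int i = 0; i < count; ++i) {
        *sum += values[i];
    }
}

// Accumulating in a local keeps the running total in a register and
// writes memory once. __restrict additionally promises the compiler
// that the two pointers do not overlap.
void accumulateLocal(const int* __restrict values, int count,
                     int* __restrict sum) {
    int total = *sum;
    for (int i = 0; i < count; ++i) {
        total += values[i];
    }
    *sum = total;
}
```

Both functions produce the same result for non-overlapping inputs; the second only changes how often memory is written and read back.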
Memory Solutions
•  Don’t allocate
•  Linearize allocations
– Use arrays
•  Memory pools
– Coherent
– No fragmentation
– No construction/destruction
•  Don’t construct or destruct
– Plain Old Structures (POS)
Memory Solutions Ctd.
•  Time scoped pools
– Frame allocator
– Pool for one level content, discarded at the
end
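A frame allocator can be sketched as a bump pointer over one preallocated buffer. This is a minimal illustration under assumptions: the class name `FrameAllocator` and its methods are hypothetical, it is not thread-safe, and it never runs destructors, so it suits plain data only.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

class FrameAllocator {
public:
    explicit FrameAllocator(std::size_t capacity)
        : buffer_(capacity), offset_(0) {}

    // Bump-allocate `size` bytes aligned to `alignment` (a power of two).
    // Returns nullptr when the frame budget is exhausted.
    void* allocate(std::size_t size,
                   std::size_t alignment = alignof(std::max_align_t)) {
        const std::uintptr_t base =
            reinterpret_cast<std::uintptr_t>(buffer_.data());
        const std::uintptr_t current = base + offset_;
        const std::uintptr_t mask = static_cast<std::uintptr_t>(alignment) - 1;
        const std::uintptr_t aligned = (current + mask) & ~mask;
        const std::size_t newOffset =
            static_cast<std::size_t>(aligned - base) + size;
        if (newOffset > buffer_.size()) return nullptr;
        offset_ = newOffset;
        return reinterpret_cast<void*>(aligned);
    }

    // Called once per frame: releases everything at once.
    void reset() { offset_ = 0; }

    std::size_t used() const { return offset_; }

private:
    std::vector<unsigned char> buffer_;
    std::size_t offset_;
};
```

A per-level pool works the same way, with `reset()` called when the level is unloaded instead of at the end of each frame.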
Memory Manager
“If you don’t have a custom memory
manager in your game, you’re a fool (or a
PC game developer)”
Christer Ericson, Director of Tools and
Technology, Sony Santa Monica
Memory Related Solutions
•  Reducing memory footprint at compile time and
runtime
•  Algorithms that reduce memory fetching
•  Reduce cache miss
–  Spatial Locality
–  Proper Stride
–  Correct Alignment
•  Increase Temporal Coherence
•  Utilize Pre-fetching
•  Avoid worst-case access patterns that break
caching
Pitfalls of Object Oriented Programming
Summary of study (Tony Albrecht, 2009)
•  Case study for CPU side rendering code
•  Just re-organizing data locations was a win
•  + pre-fetching is more win
•  Can you decouple data from objects?
•  Be aware of what the compiler and hardware
are doing, watch the generated assembly!
Pitfalls of OOP
•  Optimize for data first, then code
– Memory access is going to be your biggest
bottleneck
•  Simplify Systems
– KISS
– Easier to optimize, Easier to parallelize
•  Keep code and data homogeneous
•  Not everything needs to be an object
Pitfalls of OOP Ctd.
•  You are writing a game
– You have control over the input data
– Don’t be afraid to pre-format it if needed
•  Design for specifics, not generics
Data Oriented Design
•  Better performance
•  Better realization of code optimization
•  Often simpler code
•  More parallelizable code
CPU Bound: Compute
•  Lots of arithmetic operations rather than loads and
stores
CPU Compute: Solutions
•  Compiler flags (float: precise/fast)
•  Time against Space
–  Use of lookup tables
•  Memoization
•  Function Inlining
•  Branch prediction, out of order execution
–  Branch mis-prediction is much less costly than
cache miss
•  Make branches more predictable
CPU Compute: Solutions Ctd.
•  Remove Branches
– if (a) z = c; else z = d;
– z = a * c + (1 - a) * d (valid when a is exactly 0 or 1)
•  Profile Guided Optimization
•  Loop unrolling
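The branch-removal idea above can be sketched in two common forms. The function names are hypothetical, and whether either form beats the plain `if` is something to measure: optimizing compilers often turn a simple select into a conditional move on their own.

```cpp
#include <cstdint>

// Arithmetic form: z = a * c + (1 - a) * d, with a restricted to 0 or 1.
int selectArithmetic(bool a, int c, int d) {
    const int flag = static_cast<int>(a);  // 1 when a is true, 0 otherwise
    return flag * c + (1 - flag) * d;
}

// Bitmask form: an all-ones mask selects c, an all-zeros mask selects d.
std::int32_t selectMask(bool a, std::int32_t c, std::int32_t d) {
    const std::uint32_t mask = 0u - static_cast<std::uint32_t>(a);
    const std::uint32_t bits = (static_cast<std::uint32_t>(c) & mask) |
                               (static_cast<std::uint32_t>(d) & ~mask);
    return static_cast<std::int32_t>(bits);
}
```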
Loop Unrolling
for (i = 0; i < 100; ++i)
sum += intArray[i];
------------------------------------------------------
int sum1 = 0, sum2 = 0, sum3 = 0, sum4 = 0; // independent accumulators
for (i = 0; i < 100; i+=4)
{
sum1 += intArray[i];
sum2 += intArray[i+1];
sum3 += intArray[i+2];
sum4 += intArray[i+3];
}
sum = sum1+sum2+sum3+sum4;
Virtual Functions
•  How slow are virtual functions really?
http://assemblyrequired.crashworks.org/2009/01/19/how-slow-are-virtual-functions-really/
•  1000 iterations over 1024 vectors
•  12,288,000 function calls
•  Virtual: 159.856 ms
•  Direct: 67.962 ms
•  Inline: 8.040 ms
Slow Virtual Functions
•  Problem is not the cost of looking up the
indirect function pointer from vtable.
•  The issue lies in “branch prediction” and
the way marshalling parameters for the
calling convention can get in the way of
good instruction scheduling.
Micro Optimization
•  Bit Tricks
–  Bitwise Swap
•  X^=Y; Y^=X; X^=Y;
–  Bitmasks
•  isFlagSet = someInt & MY_FLAG, someInt |= Flag2;
•  Example use: Collisions in Physics
–  Fast Modulo
•  X % Y == X & (Y - 1) iff Y is a power of 2 (and X is non-negative)
–  Even and Odd
•  (X & 1) == 0; // same as X%2==0
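The bit tricks listed above, written out as small helpers. This is a sketch: the flag names `FLAG_SOLID` and `FLAG_TRIGGER` are hypothetical, and unsigned types are used so that the modulo identity holds without caveats about negative values.

```cpp
#include <cstdint>

// Hypothetical collision flags, one bit each.
constexpr std::uint32_t FLAG_SOLID   = 1u << 0;
constexpr std::uint32_t FLAG_TRIGGER = 1u << 1;

constexpr bool isPowerOfTwo(std::uint32_t y) {
    return y != 0 && (y & (y - 1)) == 0;
}

// Equals x % y only when y is a power of two.
constexpr std::uint32_t fastModulo(std::uint32_t x, std::uint32_t y) {
    return x & (y - 1);
}

constexpr bool isEven(std::uint32_t x) { return (x & 1u) == 0; }

constexpr bool hasFlag(std::uint32_t bits, std::uint32_t flag) {
    return (bits & flag) != 0;
}

// XOR swap; the address check matters because swapping a variable with
// itself this way would zero it.
inline void xorSwap(std::uint32_t& x, std::uint32_t& y) {
    if (&x != &y) { x ^= y; y ^= x; x ^= y; }
}
```

Modern compilers usually apply the modulo and parity rewrites themselves when the divisor is a compile-time constant, so these are mostly useful when the power-of-two property is known only to the programmer.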
Book on Bit Tricks
•  Hacker’s Delight (Henry S. Warren,
Addison Wesley, 2003)
Other Micro Optimization
•  Data type conversion
•  SSE Instructions
•  Removing loop invariant code
•  Loop unrolling
•  Cross-.obj optimization
– Whole program optimization
•  Hardware Specific Optimizations
Vector vs. List
•  Random data insertion and deletion into a
c++ vector and list compared
•  Data kept sorted in the containers
Vector vs. List Results
Vector vs. List Ctd.
STL iterator debugging
STL Iterator Debugging and Secure SCL
http://channel9.msdn.com/Shows/Going+Deep/STL-Iterator-Debugging-and-Secure-SCL
Copy vs. Move
•  Vector of strings with 4 dimensions
•  100 x 100 x 100 x 500
•  Construction: 564 ms
•  Copy Construction: 537 ms
•  Move Construction: 0.001 ms
•  Empty Destruction: 0.001 ms
•  Destruction: 285 ms
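The gap between copy and move construction in the timings above comes from what each operation has to touch. The sketch below uses a two-dimensional container to stay small; `Grid`, `makeGrid`, `copyGrid`, and `moveGrid` are hypothetical names, not the benchmark code behind the slide.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

using Grid = std::vector<std::vector<std::string>>;

Grid makeGrid(std::size_t rows, std::size_t cols) {
    return Grid(rows, std::vector<std::string>(cols, "some string payload"));
}

// Copy: allocates and copies every inner vector and every string,
// so the cost grows with the total number of elements.
Grid copyGrid(const Grid& source) { return source; }

// Move: takes over the outer buffer, so the cost is constant;
// `source` is left in a valid but unspecified state.
Grid moveGrid(Grid&& source) { return std::move(source); }
```

Typical use is `Grid b = moveGrid(std::move(a));` when `a` is no longer needed afterwards.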
GPU Bound
•  GPU related issues
–  Synchronization
–  Capabilities Management
–  Resource Management
–  Global Ordering
•  Reflections/Shadows before scene
•  Opaque front to back/Translucent back to front
•  Sort by material or texture to reduce state changes
–  Instrumentation
–  Debugging
GPU Optimization Tricks
•  State Changes
•  Draw Call (Most common issue)
•  Instancing and Batching
–  Shader Instancing
–  Hardware Instancing
•  Video RAM
–  Device Resets
–  Resource uploads/locks
•  Minimize Copies
•  Minimize Locks
•  Double Buffer
GPU Optimization Ctd.
•  Fragmentation
– Power of 2 allocations help
•  Lock culling
– Debug visualization for those culled
•  Texture debugging
– Different texture for each mip level
GPU Bound?
•  Spend a long time in API calls (Draw calls
or swap/present frame buffer)
•  Front End / Back End
– Triangles/Geometry – Pixels/Shaders
– Vary each workload and measure
performance
Back End
•  Fill Rate (ex. 1000 MP/sec)
–  FPS, Overdraw, resolution
–  Fill Rate / FPS = overdraw * resolution
–  Render Target Format (16 / 32 bit)
–  Blending
•  Transparency instead of translucency
–  Shading
•  Pixel shaders
–  Texture Sampling
•  Format, Filter Mode, Count (DXT1)
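The fill-rate relation above (Fill Rate / FPS = overdraw * resolution) can be rearranged to estimate how much overdraw a frame can afford. The helper name `maxOverdraw` is hypothetical, and the 60 FPS target and 1280x720 resolution used in the note below are assumed values for illustration; only the 1000 MP/sec figure comes from the slide.

```cpp
// Maximum average overdraw that fits in the fill-rate budget:
// overdraw = fillRate / (fps * width * height).
double maxOverdraw(double fillRatePixelsPerSec, double fps,
                   double width, double height) {
    return fillRatePixelsPerSec / fps / (width * height);
}
```

With a fill rate of 1000 MP/sec, a 60 FPS target, and 1280x720 pixels, this gives roughly 18 layers of overdraw as a theoretical ceiling. Blending, render target format, and shader cost lower the practical limit.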
Front End
•  Bottlenecks
– Vertex Transformation
•  Lighting calculations, skinning, …
– Vertex Fetching and caching
•  Vertex format, indexes (16/32 bit)
– Tessellation
Other GPU factors
•  Multi-sample antialiasing (MSAA)
– Downsample from high-res render
– Can significantly affect fill-rate
•  Lights and Shadows
– CPU, vertex processing, pixel processing
Forward vs. Deferred
•  Multiple render targets needed for
deferred
•  Lot of fill-rate needed for deferred
•  Performance is flattened
Shaders
•  Memory
•  Inter-shader communication
•  Texture sampling (biggest problem with
memory)
•  Computation
Other shader notes
•  Shader compilation
•  Shader count
–  Penalty for many shaders in one scene
–  Limits on GPU for shader execution
•  Effect framework
–  CgFX, ColladaFX (by tools like Nvidia FX
composer)
–  Oriented towards ease of use rather than performance
–  Engines have their own (Unreal 3, Unity, Source,
torque, Gamebryo)
Networking
•  Throughput
•  Latency
•  Reliability
– Out of order packets
– Corrupted
– Truncated
– Lost
Reliability
•  User Datagram Protocol (UDP)
•  Transmission Control Protocol (TCP)
Game Networking Data
•  Events
– Guaranteed, Ordered
•  State data
– Unordered, Not Guaranteed (opportunities for
optimization)
– Unless using lock step simulation
Bandwidth
•  Bitstreams and Bit packing
– Flag -> one bit
– Health -> 7 bits
•  Encoding on streams
TCP/UDP
BitStream
Decimation LZW Huffman
Most Recent State Events
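Bit packing as described above (a flag in one bit, health in seven bits) can be sketched with a pair of small classes. `BitWriter` and `BitReader` are hypothetical names, the bit order (least significant bit first) is an arbitrary choice for this example, and the reader does no bounds checking.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

class BitWriter {
public:
    // Append the low `bitCount` bits of `value` (bitCount <= 32), LSB first.
    void write(std::uint32_t value, unsigned bitCount) {
        for (unsigned i = 0; i < bitCount; ++i) {
            if (bitPos_ % 8 == 0) bytes_.push_back(0);  // start a new byte
            if ((value >> i) & 1u) {
                bytes_.back() |= static_cast<std::uint8_t>(1u << (bitPos_ % 8));
            }
            ++bitPos_;
        }
    }
    const std::vector<std::uint8_t>& bytes() const { return bytes_; }

private:
    std::vector<std::uint8_t> bytes_;
    std::size_t bitPos_ = 0;
};

class BitReader {
public:
    explicit BitReader(const std::vector<std::uint8_t>& bytes)
        : bytes_(bytes) {}

    // Read `bitCount` bits in the same order they were written.
    std::uint32_t read(unsigned bitCount) {
        std::uint32_t value = 0;
        for (unsigned i = 0; i < bitCount; ++i) {
            const std::uint32_t bit =
                (bytes_[bitPos_ / 8] >> (bitPos_ % 8)) & 1u;
            value |= bit << i;
            ++bitPos_;
        }
        return value;
    }

private:
    const std::vector<std::uint8_t>& bytes_;
    std::size_t bitPos_ = 0;
};
```

Writing a one-bit flag followed by a seven-bit health value fills exactly one byte, compared with the five or more bytes a `bool` plus an `int` would take if sent unpacked.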
Prioritizing Data
•  Fill packet with most important data first
•  Heuristic for most recent data (ex. how
close to player)
•  Only send what you must
– ex. Cull enemy behind the wall
Packets
•  Smaller than 1400 bytes
•  Send packets regularly (Routers allocate
bandwidth to those who use it)
Smooth Experience
•  Interpolation
•  Extrapolation
– Client Side Prediction
– Dead Reckoning
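Interpolation and extrapolation can be sketched in one dimension. The `State` struct and both function names are hypothetical, and a real game would apply the same idea per component of position and orientation, usually with error correction when a new update arrives.

```cpp
// Last state received from the network (one dimension for brevity).
struct State {
    float position;
    float velocity;
};

// Extrapolation (dead reckoning): project the last known state forward
// assuming the velocity stays constant.
float extrapolate(const State& last, float secondsSinceUpdate) {
    return last.position + last.velocity * secondsSinceUpdate;
}

// Interpolation: blend between two received snapshots, t in [0, 1].
float interpolate(float from, float to, float t) {
    return from + (to - from) * t;
}
```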
Profiling Networking
•  Make sure networking code is efficient
– Measure compute and memory
•  Expose what the networking layer is doing
– Number of packets
– Bandwidth for each packet
•  Be aware of situations in which the client and
server get out of sync.
Mass Storage
•  Hard Drives
•  CD, DVD
•  Blu-Ray
•  Flash Drives
Performance Issues
•  Seek Time
•  Transfer Rate (ex. 75MB/sec)
•  Worst Case
– 8ms delay between blocks on disk
– 4KB blocks
– Loading 1MB -> (1024/4) * 8 = 2048 ms, about 2
secs
– Loading 1GB -> about 35 min
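The worst-case arithmetic above can be reproduced with a one-line helper. The function name `worstCaseLoadMs` is hypothetical; it assumes every block pays the full 8 ms delay and ignores transfer time.

```cpp
// Worst-case load time in milliseconds when every block costs a full
// seek delay: (bytes / blockBytes) * delayMsPerBlock.
double worstCaseLoadMs(double bytes, double blockBytes,
                       double delayMsPerBlock) {
    return (bytes / blockBytes) * delayMsPerBlock;
}
```

For 1 MB in 4 KB blocks this gives 256 * 8 = 2048 ms. For 1 GB it gives 2,097,152 ms, which is between 34 and 35 minutes depending on how the 1 MB figure is rounded before scaling up.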
Rule
•  No disk IO in the inner loops
IO Profiling is hard
•  File systems optimize themselves based on
access patterns
•  Disk will rebalance data based on load and
sector failure
•  Disk, disk controller, file system and OS will
cache and reorder requests
•  User software may intercept the disk access
for virus scanning
•  Good idea to test on fresh machines from
time to time
Disk IO performance tips
•  Limit disk access
•  Minimize reads and writes
– Read larger chunks
•  Asynchronous Access
•  Optimize file order
•  Optimize data for fast loading
– Space on disk vs. Time to load (ex.
decompressing a JPG file)
Disk IO Tips
•  Support development and runtime formats
•  Support dynamic reloading
•  Automate resource processing
•  Centralize resource loading
–  Resource Managers
•  Preload when appropriate
•  Stream
–  First second of sound in memory
–  Small texture mip levels in memory
–  Small mesh LODs in memory
Concurrent Programming
•  Data Parallelism
– Scatter Phase
– Gather Phase
•  Task Parallelism
Threading Performance Problems
•  Scalability
•  Contention
•  Balancing
Scalability
•  The achievable speedup is limited by the
parallelizable portion of an algorithm
•  Amdahl’s Law
– S(N) = 1 / ((1 – P) + P/N)
– N: Processors, P: Parallelizable Portion
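Amdahl's Law as given above, S(N) = 1 / ((1 - P) + P/N), translates directly into code. The function name `amdahlSpeedup` is hypothetical.

```cpp
// Amdahl's Law: speedup on `processors` cores when `parallelFraction`
// (0..1) of the work can run in parallel.
double amdahlSpeedup(double parallelFraction, double processors) {
    return 1.0 / ((1.0 - parallelFraction) + parallelFraction / processors);
}
```

With P = 0.9, four processors give a speedup of about 3.08, and no number of processors can exceed 1 / (1 - P) = 10, which is why shrinking the serial portion often matters more than adding cores.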
Contention
•  More than one thread accessing the same
resource
•  Some solutions
– Thread Safety (Mutex)
– Redundant Data
– Efficient Synchronization (Locks, Atomic
Operations, …)
Balancing
•  Ensure all cores are busy
•  Eliminate starving
False Sharing
False Sharing Ctd.
struct vertex
{
float xyz[3]; // data 1
float tutuv[2]; // data 2
};
vertex triList[N];
------------------------------------------------------------
struct vertices
{
float xyz[3][N];
float tutuv[2][N];
};
vertices triList;
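A common way to avoid false sharing for per-thread data is to give each thread's slot its own cache line. This sketch is an illustration under assumptions: the 64-byte line size is typical for x86 but not universal, and `PackedCounters`, `PaddedCounter`, `PerThreadCounters`, and `total` are hypothetical names.

```cpp
#include <cstddef>

constexpr std::size_t kCacheLineSize = 64;  // assumed cache line size

// Both counters sit in the same cache line: if two threads each write
// one of them, the line bounces between cores (false sharing).
struct PackedCounters {
    long a = 0;
    long b = 0;
};

// Each counter is aligned and padded to a full cache line.
struct alignas(kCacheLineSize) PaddedCounter {
    long value = 0;
};

// One slot per worker thread; thread i is meant to write only slots[i].
struct PerThreadCounters {
    PaddedCounter slots[4];
};

// Combine the per-thread results after the workers have finished.
long total(const PerThreadCounters& counters) {
    long sum = 0;
    for (const PaddedCounter& slot : counters.slots) sum += slot.value;
    return sum;
}
```

The padding trades memory for independence: four counters now occupy 256 bytes instead of 32, which is usually acceptable for a handful of hot, frequently written values.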
Multi-threaded Profiling
•  Look for time spent on synchronization
primitives
•  Look out for Heisenbugs!
•  Assess Amdahl’s Law
•  Use multi-threaded profilers
No Synchronization is best
•  Lock-free algorithms are great.
•  Wait-free algorithms are even better!!
Mike Acton notes on wait free coding:
http://cellperformance.beyond3d.com/articles/2009/08/roundup-recent-sketches-on-concurrency-data-design-and-performance.html
Managed Languages
•  Execute on a runtime
•  C#, Java, JavaScript, Lua, Python, PHP,
ActionScript
Concerns for Profiling
•  Garbage Collector
•  Just in Time compiler
•  No high accuracy timers
•  Allocation can be costly, usually no stack
Managed/Unmanaged
•  Gameplay code is usually not performance
critical
•  Bottlenecks can be replaced with native
code
Dealing with GC
•  Memory pressure causes GC to run
frequently and cause sudden hitches
•  Memory pressure causes big memory
footprint and hurts cache efficiency
•  Big total working set needs the GC to
check all the pointers
•  Incremental GC behavior is helpful but
high pressure can force GC to collect all
Strategies for dealing with GC
•  Less data on heap
•  Your own memory management
•  Memory pooling
•  Keeping temporary objects as class members
and reusing them, instead of creating new local
objects on each call
Dealing with JIT
•  JIT activation time is important for
performance (startup, after a few function
calls, …)
•  Constructors usually left out (Heavy
initialization code needs to be in a helper
function)
•  JIT might not be available on all platforms
Optimizing Animation
•  Channel Omission
•  Quantization
•  Sample Frequency and Key Omission
•  Curve Based Compression
•  Selective Loading and Streaming
•  Hardware Skinning
Misc. Optimization Related Topics
•  Mesh LOD
•  Animation LOD
•  AI LOD
•  Collision Detection Spatial Partitioning
•  Physics Optimizations (GPU, Sleeps, …)
PIX Test Case
•  PIX (Performance Investigator for Xbox)
•  Part of DirectX SDK
•  Used for DirectX based applications
•  Used for analyzing Garshasp 1 and
Garshasp: Temple of the Dragon
(Expansion)
Using PIX to Analyze Garshasp
Selecting Measurement Attributes
In-Game HUD
PIX Report
Garshasp Performance Post-Mortem
•  Animation skinning (Intel VTune)
–  Switched to Hardware Skinning
•  Asset Loading
–  Used background thread
•  Draw Calls
–  Dynamic Far-Clip distance
•  High RAM consumption
–  Reduced particle quotas
–  Reduced Area arrangement (changes in camera
system needed)
–  Reduced Texture size
–  Better strategies for audio loading/unloading
Garshasp Ctd.
•  Large Video memory usage
– Changed mesh geometry
– Better seamlessness strategy
•  Frame rate drops
– Better use of particles
– Modifications to camera angles and
seamlessness strategy
– Smaller areas for more even distribution of
resource loading.
Some unresolved issues
•  Unoptimized animation system
•  Overdraw
•  Slow Game Object update loop
•  No static batching
–  Use of vertex color for baked color
•  Huge game save data
•  Inefficient texture size usage
•  No sound/video streaming
•  + many more!
Biggest Optimization Related Problem
No internal resource consciousness!
Unity Editor Profiler
Profiler Views
CPU
Deep Calls
Rendering Information
Memory
CPU vs. GPU
References
•  Video Game Optimization, Ben Garney and Eric Preisz
•  “How the left and right brain learned to love one another”, Tim Moss,
http://timmoss.blogspot.com/2007/02/it-seems-reasonable-that-my-very-first.html
•  “Optimization is a Full time job”, Maciej Sinilo,
http://msinilo.pl/blog/?p=483
•  “Memory Optimization”, Christer Ericson,
http://www.research.scea.com/research/pdfs/GDC2003_Memory_Optimization_18Mar03.pdf
•  “A pragmatic approach to optimization”, Niklas Frykholm,
http://bitsquid.blogspot.com/2011/12/pragmatic-approach-to-performance.html
References Ctd.
•  Hacker’s Delight (Henry S. Warren, Addison Wesley 2003)
•  Advanced Bit Manipulation-fu, Christer Ericson,
http://realtimecollisiondetection.net/blog/?p=78
•  Networking for Game Programmers, Glenn Fiedler,
http://gafferongames.com/networking-for-game-programmers/
•  Source Multiplayer Networking, Valve Software,
https://developer.valvesoftware.com/wiki/Source_Multiplayer_Networking
References Ctd.
•  False sharing and its effect on memory performance, William J. Bolosky,
http://static.usenix.org/publications/library/proceedings/sedms4/full_papers/bolosky.txt
•  Concurrency, Data Design and Performance, Mike Acton,
http://cellperformance.beyond3d.com/articles/2009/08/roundup-recent-sketches-on-concurrency-data-design-and-performance.html
•  Diving down the concurrency rabbit hole, Mike Acton,
http://www.insomniacgames.com/tech/articles/0809/files/concurrency_rabit_hole.pdf
References Ctd.
•  Scalar Quantization, Jonathan Blow,
http://number-none.com/product/Scalar%20Quantization/index.html
•  Are we out of memory, Christian Gyrling,
http://www.swedishcoding.com/2008/08/31/are-we-out-of-memory/
•  Practical Efficient Memory Management, Jesus De Santos,
http://entland.homelinux.com/blog/2008/08/19/practical-efficient-memory-management/
References Ctd.
•  Load Hit Store and the restrict keyword, Elan Ruskin,
http://assemblyrequired.crashworks.org/2008/07/08/load-hit-stores-and-the-__restrict-keyword/
•  How slow are virtual functions really, Elan Ruskin,
http://assemblyrequired.crashworks.org/2009/01/19/how-slow-are-virtual-functions-really/
•  Current Generation Parallelism in Games, Jon Olick,
http://s08.idav.ucdavis.edu/olick-current-and-next-generation-parallelism-in-games.pdf
References Ctd.
•  Real Life Performance Pitfalls, Alan Murphy,
http://www.microsoft.com/en-us/download/confirmation.aspx?id=3539
•  Graphics Programming Black Book, Michael Abrash
•  Zen of Code Optimization, Michael Abrash
•  The Free Lunch is Over, Herb Sutter,
http://www.gotw.ca/publications/concurrency-ddj.htm
References Ctd.
•  Intel Software Optimization Cookbook,
http://www.intel.com/intelpress/sum_swcb2.htm
•  Pitfalls of Object Oriented Programming, Tony Albrecht,
http://www.reddit.com/r/programming/comments/ag43j/pitfalls_of_object_oriented_programming_pdf/
•  Microsoft PIX,
http://msdn.microsoft.com/en-us/library/ee663275(v=vs.85).aspx
References Ctd.
•  Top 10 Myths of Video Game Optimization,
http://www.gamasutra.com/view/feature/130296/the_top_10_myths_of_video_game_.php?print=1
Questions?
fassihi@fanafzar.com
Fanafzar Game Studio

Más contenido relacionado

Destacado

Destacado (20)

Fanafzar
FanafzarFanafzar
Fanafzar
 
از بازی سازی در ایران تا بازی سازی ایرانی
از بازی سازی در ایران تا بازی سازی ایرانیاز بازی سازی در ایران تا بازی سازی ایرانی
از بازی سازی در ایران تا بازی سازی ایرانی
 
بازی سازی ایرانی! توهم؟ واقعیت؟
بازی سازی ایرانی! توهم؟ واقعیت؟بازی سازی ایرانی! توهم؟ واقعیت؟
بازی سازی ایرانی! توهم؟ واقعیت؟
 
بازاریابی بازی‌های ویدیویی
بازاریابی بازی‌های ویدیوییبازاریابی بازی‌های ویدیویی
بازاریابی بازی‌های ویدیویی
 
What are Game Publishers Looking For?
What are Game Publishers Looking For?What are Game Publishers Looking For?
What are Game Publishers Looking For?
 
Video game pitch
Video game pitchVideo game pitch
Video game pitch
 
You have 10 seconds: Understanding how to make your game pitch great
You have 10 seconds: Understanding how to make your game pitch greatYou have 10 seconds: Understanding how to make your game pitch great
You have 10 seconds: Understanding how to make your game pitch great
 
Subtle Anamorphic Lens Effects - Real-time Rendering of Physically Based Opt...
Subtle Anamorphic Lens Effects - Real-time Rendering of Physically Based Opt...Subtle Anamorphic Lens Effects - Real-time Rendering of Physically Based Opt...
Subtle Anamorphic Lens Effects - Real-time Rendering of Physically Based Opt...
 
چگونه موفقیت یا شکست پروژه شما از قبل مشخص شده است
چگونه موفقیت یا شکست پروژه شما از قبل مشخص شده استچگونه موفقیت یا شکست پروژه شما از قبل مشخص شده است
چگونه موفقیت یا شکست پروژه شما از قبل مشخص شده است
 
Startup Studio Pitch - Best Practices
Startup Studio Pitch - Best PracticesStartup Studio Pitch - Best Practices
Startup Studio Pitch - Best Practices
 
جادوی خلاقیت
جادوی خلاقیتجادوی خلاقیت
جادوی خلاقیت
 
The Art of Game Development
The Art of Game DevelopmentThe Art of Game Development
The Art of Game Development
 
بازاریابی بازی‌های موبایل
بازاریابی بازی‌های موبایلبازاریابی بازی‌های موبایل
بازاریابی بازی‌های موبایل
 
Indie Games Developer Pitch Deck template
Indie Games Developer Pitch Deck templateIndie Games Developer Pitch Deck template
Indie Games Developer Pitch Deck template
 
Game Studio Leadership: You Can Do It
Game Studio Leadership: You Can Do ItGame Studio Leadership: You Can Do It
Game Studio Leadership: You Can Do It
 
How Wealthsimple raised $2M in 2 weeks
How Wealthsimple raised $2M in 2 weeksHow Wealthsimple raised $2M in 2 weeks
How Wealthsimple raised $2M in 2 weeks
 
AdPushup Fundraising Deck - First Pitch
AdPushup Fundraising Deck - First PitchAdPushup Fundraising Deck - First Pitch
AdPushup Fundraising Deck - First Pitch
 
Zenpayroll Pitch Deck Template
Zenpayroll Pitch Deck TemplateZenpayroll Pitch Deck Template
Zenpayroll Pitch Deck Template
 
The deck we used to raise $270k for our startup Castle
The deck we used to raise $270k for our startup CastleThe deck we used to raise $270k for our startup Castle
The deck we used to raise $270k for our startup Castle
 
SteadyBudget's Seed Funding Pitch Deck
SteadyBudget's Seed Funding Pitch DeckSteadyBudget's Seed Funding Pitch Deck
SteadyBudget's Seed Funding Pitch Deck
 

Similar a Videogame Optimization

Sony Computer Entertainment Europe Research & Development Division
Sony Computer Entertainment Europe Research & Development DivisionSony Computer Entertainment Europe Research & Development Division
Sony Computer Entertainment Europe Research & Development Division
Slide_N
 
Hadoop - Disk Fail In Place (DFIP)
Hadoop - Disk Fail In Place (DFIP)Hadoop - Disk Fail In Place (DFIP)
Hadoop - Disk Fail In Place (DFIP)
mundlapudi
 

Similar a Videogame Optimization (20)

Supersize Your Production Pipe
Supersize Your Production PipeSupersize Your Production Pipe
Supersize Your Production Pipe
 
Using The New Flash Stage3D Web Technology To Build Your Own Next 3D Browser ...
Using The New Flash Stage3D Web Technology To Build Your Own Next 3D Browser ...Using The New Flash Stage3D Web Technology To Build Your Own Next 3D Browser ...
Using The New Flash Stage3D Web Technology To Build Your Own Next 3D Browser ...
 
Maximize Your Production Effort (English)
Maximize Your Production Effort (English)Maximize Your Production Effort (English)
Maximize Your Production Effort (English)
 
Code and memory optimization tricks
Code and memory optimization tricksCode and memory optimization tricks
Code and memory optimization tricks
 
Code and Memory Optimisation Tricks
Code and Memory Optimisation Tricks Code and Memory Optimisation Tricks
Code and Memory Optimisation Tricks
 
Sony Computer Entertainment Europe Research & Development Division
Sony Computer Entertainment Europe Research & Development DivisionSony Computer Entertainment Europe Research & Development Division
Sony Computer Entertainment Europe Research & Development Division
 
Umbra Ignite 2015: Graham Wihlidal – Adapting a technology stream to ever-evo...
Umbra Ignite 2015: Graham Wihlidal – Adapting a technology stream to ever-evo...Umbra Ignite 2015: Graham Wihlidal – Adapting a technology stream to ever-evo...
Umbra Ignite 2015: Graham Wihlidal – Adapting a technology stream to ever-evo...
 
God Of War : post mortem
God Of War : post mortemGod Of War : post mortem
God Of War : post mortem
 
When Tools Attack
When Tools AttackWhen Tools Attack
When Tools Attack
 
Hadoop - Disk Fail In Place (DFIP)
Hadoop - Disk Fail In Place (DFIP)Hadoop - Disk Fail In Place (DFIP)
Hadoop - Disk Fail In Place (DFIP)
 
PlayStation: Cutting Edge Techniques
PlayStation: Cutting Edge TechniquesPlayStation: Cutting Edge Techniques
PlayStation: Cutting Edge Techniques
 
Jan Hloušek, Keen Software House
Jan Hloušek, Keen Software HouseJan Hloušek, Keen Software House
Jan Hloušek, Keen Software House
 
Week Eight - Introduction to Hardware
Week Eight - Introduction to HardwareWeek Eight - Introduction to Hardware
Week Eight - Introduction to Hardware
 
.NET Memory Primer (Martin Kulov)
.NET Memory Primer (Martin Kulov).NET Memory Primer (Martin Kulov)
.NET Memory Primer (Martin Kulov)
 
Sephy engine development document
Sephy engine development documentSephy engine development document
Sephy engine development document
 
Ops Jumpstart: MongoDB Administration 101
Ops Jumpstart: MongoDB Administration 101Ops Jumpstart: MongoDB Administration 101
Ops Jumpstart: MongoDB Administration 101
 
Inside the IT Territory game server / Mark Lokshin (IT Territory)
Inside the IT Territory game server / Mark Lokshin (IT Territory)Inside the IT Territory game server / Mark Lokshin (IT Territory)
Inside the IT Territory game server / Mark Lokshin (IT Territory)
 
Basic Optimization and Unity Tips & Tricks by Yogie Aditya
Basic Optimization and Unity Tips & Tricks by Yogie AdityaBasic Optimization and Unity Tips & Tricks by Yogie Aditya
Basic Optimization and Unity Tips & Tricks by Yogie Aditya
 
Deployment Strategies (Mongo Austin)
Deployment Strategies (Mongo Austin)Deployment Strategies (Mongo Austin)
Deployment Strategies (Mongo Austin)
 
AWS re:Invent 2016: [REPEAT] How EA Leveraged Amazon Redshift and AWS Partner...
AWS re:Invent 2016: [REPEAT] How EA Leveraged Amazon Redshift and AWS Partner...AWS re:Invent 2016: [REPEAT] How EA Leveraged Amazon Redshift and AWS Partner...
AWS re:Invent 2016: [REPEAT] How EA Leveraged Amazon Redshift and AWS Partner...
 

Más de Amir H. Fassihi

Más de Amir H. Fassihi (17)

کارگاه چشم‌انداز در بازی سازی
کارگاه چشم‌انداز در بازی سازیکارگاه چشم‌انداز در بازی سازی
کارگاه چشم‌انداز در بازی سازی
 
Planning Chaos - Online Workshop
Planning Chaos - Online WorkshopPlanning Chaos - Online Workshop
Planning Chaos - Online Workshop
 
Planning Chaos
Planning ChaosPlanning Chaos
Planning Chaos
 
کارگاه مبانی بازی سازی
کارگاه مبانی بازی سازیکارگاه مبانی بازی سازی
کارگاه مبانی بازی سازی
 
کارگاه سفر بازی‌ساز
کارگاه سفر بازی‌سازکارگاه سفر بازی‌ساز
کارگاه سفر بازی‌ساز
 
کارگاه مدیریت تیم خلاق ۳
کارگاه مدیریت تیم خلاق ۳کارگاه مدیریت تیم خلاق ۳
کارگاه مدیریت تیم خلاق ۳
 
کارگاه مدیریت تیم خلاق ۲
کارگاه مدیریت تیم خلاق ۲کارگاه مدیریت تیم خلاق ۲
کارگاه مدیریت تیم خلاق ۲
 
مدیریت تیم خلاق ۱
مدیریت تیم خلاق ۱مدیریت تیم خلاق ۱
مدیریت تیم خلاق ۱
 
کسب‌و‌کار بازی‌های ویدیویی ۳
کسب‌و‌کار بازی‌های ویدیویی ۳کسب‌و‌کار بازی‌های ویدیویی ۳
کسب‌و‌کار بازی‌های ویدیویی ۳
 
داستان هیولا
داستان هیولاداستان هیولا
داستان هیولا
 
کسب‌و‌کار بازی‌های ویدیویی ۲
کسب‌و‌کار بازی‌های ویدیویی ۲کسب‌و‌کار بازی‌های ویدیویی ۲
کسب‌و‌کار بازی‌های ویدیویی ۲
 
کسب‌و‌کار بازی‌های ویدیویی
کسب‌و‌کار بازی‌های ویدیوییکسب‌و‌کار بازی‌های ویدیویی
کسب‌و‌کار بازی‌های ویدیویی
 
بازی‌سازی در فن‌افزار - ۱۳۹۶
بازی‌سازی در فن‌افزار - ۱۳۹۶بازی‌سازی در فن‌افزار - ۱۳۹۶
بازی‌سازی در فن‌افزار - ۱۳۹۶
 
کارگاه کار تیمی
کارگاه کار تیمیکارگاه کار تیمی
کارگاه کار تیمی
 
تیم ایرانی و مهر
تیم ایرانی و مهرتیم ایرانی و مهر
تیم ایرانی و مهر
 
رازهای بهترین تیم های بازی ساز
رازهای بهترین تیم های بازی سازرازهای بهترین تیم های بازی ساز
رازهای بهترین تیم های بازی ساز
 
Game Ecosystem in Iran
Game Ecosystem in IranGame Ecosystem in Iran
Game Ecosystem in Iran
 

Videogame Optimization

  • 1. Video Game Optimization Workshop Amir H. Fassihi Fanafzar Game Studio Aug 2012
  • 2. System Design Requirements •  Functional •  Non Functional
  • 3. Non Functional Requirements •  Maintainability •  Extensibility •  Security •  Scalability •  Intellectual Manageability •  Availability •  Portability •  Usability •  Performance
  • 4. Performance The amount of work accomplished by a computer system compared to the time and resources used. •  Short response time •  High throughput •  Low utilization of computer resources •  High availability of applications •  Fast data compression and decompression •  High bandwidth/ Short data transmission time
  • 5. Video Games •  Most x-abilities are important – Even more so for game engines. (As in enterprise applications) •  Performance is REALLY important! – For any game or game engine.
  • 6. System Design •  Solution for Functional Requirements •  Solution for Non-Functional Requirements – Bulk of the technical efforts – Conflicts in Design! – Performance as the bad boy in the group – Performance as the cream of the crop – Performance being directly experienced by end user
  • 7. Can you make this?
  • 8. Optimization •  “The process of modifying a software system to make some aspects of it work more efficiently or use fewer resources.”
  • 9. Optimization Lifecycle 1.  Benchmark 2.  Detect (Hotspots and Bottlenecks) 3.  Solve 4.  Check 5.  Goto 1
  • 10. Levels of Optimization •  System Level •  Algorithmic Level •  Micro Level – Branch prediction – Instruction throughput – Latency
  • 11. Project Lifecycle and Optimization •  Pre-production •  Production •  Post-production Optimization from High Level to Low Level Quake Story: High level architectural optimization before low level triangle draw function (Carmack and Abrash) http://www.bluesnews.com/abrash/
  • 12. Measuring Performance in Games 1.  Set Specification 1.  Performance Goal (FPS, time) 2.  Hardware Specification 2.  Define Line Items 1.  CPU time, RAM, GPU time, Video Mem 2.  Rendering, Physics, Sound, Gameplay, Misc.
  • 13. Memory Management (God of War) 32 Meg memory: 16 Meg for Levels (split into 2), 4 * 1 Meg Enemies, 1.5 Meg Exe, Run Time Data, Perm Data •  Establish Hard Rules. –  16 Meg for Level Data (Split into 2 Levels) –  4 * 1 Meg for Enemies •  Maintain 60fps From: Tim Moss 2006 GDC Talk
  • 14. Tools •  Profilers (Intel VTune, VS Profiler, …) – Total time – Self time – Calls •  System Monitors (Nvidia PerfHud, MS PIX,…) •  System Adjusters (Intel GPA, …)
  • 15. Holistic Optimization •  Optimization Process •  CPU Bound •  GPU Bound
  • 16. CPU Bound, Memory •  Prefetching Memory •  Memory Cache
  • 17. Memory Optimization •  Cache Miss – Instruction Cache – Data Cache
  • 18. Memory Hierarchy source: Memory Optimization, Christer Ericson, GDC 2003
  • 19. Data Access Patterns •  Linear Access Forward for (i = 0; i < numData; ++i) memArray[i]; •  Linear Access Backward
  • 20. Data Access Patterns Ctd. •  Periodic Access struct vertex { float pos[3]; float norm[3]; float textCoord[3]; }; for (i = 0; i < num; ++i) vertexArray[i].pos; •  Random Access
  • 21. AOS vs. SOA (Array of Structures vs. Structure of Arrays)
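The two layouts named on this slide can be sketched as follows. This is a minimal illustration that reuses the vertex fields from the previous slide; the names `VertexAoS`, `VerticesSoA`, `sumPosXAoS` and `sumPosXSoA` are hypothetical and do not come from the original deck.

```cpp
#include <vector>

// Array of Structures: every vertex keeps all of its attributes together.
// A pass that reads only positions still drags normals and texture
// coordinates through the cache (periodic access, stride sizeof(VertexAoS)).
struct VertexAoS {
    float pos[3];
    float norm[3];
    float texCoord[3];
};

// Structure of Arrays: each attribute component lives in its own
// contiguous array, so a pass over one attribute is a linear access.
struct VerticesSoA {
    std::vector<float> posX, posY, posZ;
    std::vector<float> normX, normY, normZ;
};

// Sums the x position of every vertex in the AoS layout.
float sumPosXAoS(const std::vector<VertexAoS>& verts) {
    float sum = 0.0f;
    for (const VertexAoS& v : verts) sum += v.pos[0];
    return sum;
}

// Same computation on the SoA layout: only posX is touched.
float sumPosXSoA(const VerticesSoA& verts) {
    float sum = 0.0f;
    for (float x : verts.posX) sum += x;
    return sum;
}
```

Both functions compute the same result; which layout is faster depends on the access pattern and should be measured with a profiler.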
  • 22. Critical Stride •  Stride size in memory read can cause cache thrashing
  • 23. Strip Mining for { access pos; } for { access norm; } ------------------------------------------------------ for { access pos; access norm; }
  • 24. Memory •  Stack – Temporal coherence, spatial locality •  Global – No fragmentation, freed at end •  Heap – new, delete, malloc, free – No spatial locality, no temporal coherence, fragmentation
  • 25. Load-Hit-Store •  Write data to address x and then read the data from address x -> Large stall •  Writing data all the way to the main memory through all caches -> 40 to 80 CPU cycle delay •  http://assemblyrequired.crashworks.org/2008/07/08/load-hit-stores-and-the-__restrict-keyword/
  • 27. Memory Solutions •  Don’t allocate •  Linearize allocations – Use arrays •  Memory pools – Coherent – No fragmentation – No construction/destruction •  Don’t construct or destruct – Plain Old Structures (POS)
  • 28. Memory Solutions •  Time scoped pools – Frame allocator – Pool for one level content, discarded at the end
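A frame allocator of the kind mentioned above could look roughly like this. It is a sketch under stated assumptions, not the allocator used by the author: the class name `FrameAllocator` and its interface are hypothetical, and it suits only data that needs no destructor calls (the "plain old structures" of the previous slide).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Linear (bump) allocator: allocation is a pointer increment, and the
// whole pool is discarded at once when its time scope (a frame) ends.
class FrameAllocator {
public:
    explicit FrameAllocator(std::size_t capacity) : buffer_(capacity), offset_(0) {}

    // Returns nullptr when the request does not fit.
    // `alignment` must be a power of two.
    void* allocate(std::size_t size,
                   std::size_t alignment = alignof(std::max_align_t)) {
        const std::uintptr_t base = reinterpret_cast<std::uintptr_t>(buffer_.data());
        const std::uintptr_t current = base + offset_;
        const std::uintptr_t mask = static_cast<std::uintptr_t>(alignment) - 1;
        const std::uintptr_t aligned = (current + mask) & ~mask;
        const std::size_t newOffset = static_cast<std::size_t>(aligned - base) + size;
        if (newOffset > buffer_.size()) return nullptr;
        offset_ = newOffset;
        return reinterpret_cast<void*>(aligned);
    }

    // Call once per frame: releases everything without running destructors.
    void reset() { offset_ = 0; }

    std::size_t used() const { return offset_; }

private:
    std::vector<std::uint8_t> buffer_;
    std::size_t offset_;
};
```

A game loop would typically call `reset()` at the start of each frame and hand out scratch memory with `allocate()` until the frame ends; there is no per-allocation free and therefore no fragmentation.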
  • 29. Memory Manager “If you don’t have a custom memory manager in your game, you’re a fool (or a PC game developer)” Christer Ericson, Director of Tools and Technology, Sony Santa Monica
  • 30. Memory Related Solutions •  Reducing memory footprint at compile time and runtime •  Algorithms that reduce memory fetching •  Reduce cache miss –  Spatial Locality –  Proper Stride –  Correct Alignment •  Increase Temporal Coherence •  Utilize Pre-fetching •  Avoid worst-case access patterns that break caching
  • 31. Pitfalls of Object Oriented Programming Summary of study (Tony Albrecht, 2009) •  Case study for CPU side rendering code •  Just re-organizing data locations was a win •  + pre-fetching is more win •  Can you decouple data from objects? •  Be aware of what the compiler and hardware are doing, watch the generated assembly!
  • 32. Pitfalls of OOP •  Optimize for data first, then code – Memory access is going to be your biggest bottleneck •  Simplify Systems – KISS – Easier to optimize, Easier to parallelize •  Keep code and data homogeneous •  Not everything needs to be an object
  • 33. Pitfalls of OOP •  You are writing a game – You have control over the input data – Don’t be afraid to pre-format it if needed •  Design for specifics, not generics
  • 34. Data Oriented Design •  Better performance •  Better realization of code optimization •  Often simpler code •  More parallelizable code
  • 35. CPU Bound: Compute •  Lots of arithmetic operations rather than loads and stores
  • 36. CPU Compute: Solutions •  Compiler flags (float: precise/fast) •  Time against Space –  Use of lookup tables •  Memoization •  Function Inlining •  Branch prediction, out of order execution –  Branch mis-prediction is much less costly than cache miss •  Make branches more predictable
  • 37. CPU Compute: Solutions •  Remove Branches – if (a) z = c; else z = d; – z = a * c + (1 - a) * d (valid when a is exactly 0 or 1) •  Profile Guided Optimization •  Loop unrolling
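The branch removal on this slide can be written out as below. This is an illustrative sketch: the function names are hypothetical, and the mask variant is a common alternative that the slide does not show. Optimizing compilers often turn the simple branch into a conditional move on their own, so any gain should be confirmed by measurement.

```cpp
#include <cstdint>

// Branching version: the CPU has to predict which side is taken.
inline std::int32_t selectBranch(bool a, std::int32_t c, std::int32_t d) {
    if (a) return c;
    return d;
}

// Arithmetic version from the slide; `a` must be exactly 0 or 1.
inline std::int32_t selectArithmetic(std::int32_t a, std::int32_t c, std::int32_t d) {
    return a * c + (1 - a) * d;
}

// Mask version (an addition, not from the slide): 0 - a is all ones
// when a == 1 and all zeros when a == 0.
inline std::uint32_t selectMask(std::uint32_t a, std::uint32_t c, std::uint32_t d) {
    const std::uint32_t mask = 0u - a;
    return (c & mask) | (d & ~mask);
}
```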
  • 38. Loop Unrolling for (i = 0; i < 100; ++i) sum += intArray[i]; ------------------------------------------------------ int sum1 = 0, sum2 = 0, sum3 = 0, sum4 = 0; for (i = 0; i < 100; i += 4) { sum1 += intArray[i]; sum2 += intArray[i+1]; sum3 += intArray[i+2]; sum4 += intArray[i+3]; } sum = sum1 + sum2 + sum3 + sum4;
  • 39. Virtual Functions •  How slow are virtual functions really? http://assemblyrequired.crashworks.org/2009/01/19/how-slow-are-virtual-functions-really/ •  1000 iterations over 1024 vectors •  12,288,000 function calls •  Virtual: 159.856 ms •  Direct: 67.962 ms •  Inline: 8.040 ms
  • 40. Slow Virtual Functions •  Problem is not the cost of looking up the indirect function pointer from vtable. •  The issue lies in “branch prediction” and the way marshalling parameters for the calling convention can get in the way of good instruction scheduling.
  • 41. Micro Optimization •  Bit Tricks –  Bitwise Swap •  X ^= Y; Y ^= X; X ^= Y; –  Bitmasks •  isFlagSet = someInt & MY_FLAG; someInt |= Flag2; •  Example use: Collisions in Physics –  Fast Modulo •  X % Y == X & (Y - 1) when Y is a power of 2 and X is non-negative –  Even and Odd •  (X & 1) == 0; // same as X % 2 == 0
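The bit tricks listed on this slide, written as small functions. The function and flag names (`fastModulo`, `kFlagSolid`, and so on) are hypothetical; unsigned types are used so that the modulo identity holds for every input.

```cpp
#include <cassert>
#include <cstdint>

// True when y has exactly one bit set.
inline bool isPowerOfTwo(std::uint32_t y) {
    return y != 0 && (y & (y - 1)) == 0;
}

// x % y for unsigned x; valid only when y is a power of two.
inline std::uint32_t fastModulo(std::uint32_t x, std::uint32_t y) {
    assert(isPowerOfTwo(y));
    return x & (y - 1);
}

// Same result as x % 2 == 0.
inline bool isEven(std::uint32_t x) { return (x & 1u) == 0; }

// Bitmask flags, e.g. collision categories in a physics system.
constexpr std::uint32_t kFlagSolid   = 1u << 0;  // hypothetical flag
constexpr std::uint32_t kFlagTrigger = 1u << 1;  // hypothetical flag

inline bool hasFlag(std::uint32_t bits, std::uint32_t flag) {
    return (bits & flag) != 0;
}

// XOR swap. The guard matters: swapping a variable with itself
// through two references would otherwise zero it.
inline void xorSwap(std::uint32_t& x, std::uint32_t& y) {
    if (&x == &y) return;
    x ^= y;
    y ^= x;
    x ^= y;
}
```

On current compilers a plain `std::swap` or `x % 8` usually compiles to code at least as good, so these tricks are mainly useful when the compiler cannot prove the precondition itself.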
  • 42. Book on Bit Tricks •  Hacker’s Delight (Henry S. Warren, Addison Wesley, 2003)
  • 43. Other Micro Optimization •  Data type conversion •  SSE Instructions •  Removing loop invariant code •  Loop unrolling •  Cross-.obj optimization – Whole program optimization •  Hardware Specific Optimizations
  • 44. Vector vs. List •  Random data insertion and deletion into a C++ vector and list compared •  Data kept sorted in the containers
  • 45. Vector vs. List Results
  • 46. Vector vs. List Ctd.
  • 47. STL iterator debugging STL Iterator Debugging and Secure SCL http://channel9.msdn.com/Shows/Going+Deep/STL-Iterator-Debugging-and-Secure-SCL
  • 48. Copy vs. Move •  Vector of strings with 4 dimensions •  100 x 100 x 100 x 500 •  Construction: 564 ms •  Copy Construction: 537 ms •  Move Construction: 0.001 ms •  Empty Destruction: 0.001 ms •  Destruction: 285 ms
  • 49. GPU Bound •  GPU related issues –  Synchronization –  Capabilities Management –  Resource Management –  Global Ordering •  Reflections/Shadows before scene •  Opaque front to back/Translucent back to front •  Sort by material or texture to reduce state changes –  Instrumentation –  Debugging
  • 50. GPU Optimization Tricks •  State Changes •  Draw Call (Most common issue) •  Instancing and Batching –  Shader Instancing –  Hardware Instancing •  Video RAM –  Device Resets –  Resource uploads/locks •  Minimize Copies •  Minimize Locks •  Double Buffer
  • 51. GPU Optimization Ctd. •  Fragmentation – Power of 2 allocations help •  Lock culling – Debug visualization for those culled •  Texture debugging – Different texture for each mip level
  • 52. GPU Bound? •  Spending a long time in API calls (Draw calls or swap/present frame buffer) •  Front End / Back End – Triangles/Geometry – Pixels/Shaders – Vary each workload and measure performance
  • 53. Back End •  Fill Rate (ex. 1000 MP/sec) –  FPS, Overdraw, resolution –  Fill Rate / FPS = overdraw * resolution –  Render Target Format (16 / 32 bit) –  Blending •  Transparency instead of translucency –  Shading •  Pixel shaders –  Texture Sampling •  Format, Filter Mode, Count (DXT1)
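The fill-rate relation above (Fill Rate / FPS = overdraw * resolution) can be rearranged to estimate how much overdraw a frame can afford. The function name `overdrawBudget` and the 1280x720 resolution in the usage note are illustrative assumptions; only the 1000 MP/sec figure comes from the slide.

```cpp
// Average number of times each pixel can be written per frame:
//   overdraw = (fillRate / fps) / (width * height)
// This is a theoretical ceiling; blending, render target format and
// shader cost lower the rate actually achieved.
double overdrawBudget(double fillRatePixelsPerSec, double fps, int width, int height) {
    const double pixelsPerFrame = fillRatePixelsPerSec / fps;
    return pixelsPerFrame / (static_cast<double>(width) * height);
}
```

For example, `overdrawBudget(1000e6, 60.0, 1280, 720)` gives roughly 18, meaning each pixel could be touched about 18 times per frame at 60 FPS before fill rate alone becomes the limit.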
  • 54. Front End •  Bottlenecks – Vertex Transformation •  Lighting calculations, skinning, … – Vertex Fetching and caching •  Vertex format, indexes (16/32 bit) – Tessellation
  • 55. Other GPU factors •  Multi-sample antialiasing (MSAA) – Downsample from high-res render – Can significantly affect fill-rate •  Lights and Shadows – CPU, vertex processing, pixel processing
  • 56. Forward VS. Deferred •  Multiple render targets needed for deferred •  Lot of fill-rate needed for deferred •  Performance is flattened
  • 57. Shaders •  Memory •  Inter-shader communication •  Texture sampling (biggest problem with memory) •  Computation
  • 58. Other shader notes •  Shader compilation •  Shader count –  Penalty for many shaders in one scene –  Limits on GPU for shader execution •  Effect framework –  CgFX, ColladaFX (by tools like Nvidia FX composer) –  Oriented more towards ease of use than performance –  Engines have their own (Unreal 3, Unity, Source, Torque, Gamebryo)
  • 59. Networking •  Throughput •  Latency •  Reliability – Out of order packets – Corrupted – Truncated – Lost
  • 60. Reliability •  User Datagram Protocol (UDP) •  Transmission Control Protocol (TCP)
  • 61. Game Networking Data •  Events – Guaranteed, Ordered •  State data – Unordered, Not Guaranteed (opportunities for optimization) – Unless using lock step simulation
  • 62. Bandwidth •  Bitstreams and Bit packing – Flag -> one bit – Health -> 7 bits •  Encoding on streams (diagram labels: TCP/UDP, BitStream, Decimation, LZW, Huffman, Most Recent State, Events)
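Bit packing as described on this slide can be sketched with a small writer/reader pair. `BitWriter` and `BitReader` are hypothetical names and this is not a particular engine's bitstream API; bits are stored least significant first within each byte, which is an arbitrary choice made here.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

class BitWriter {
public:
    // Appends the lowest `bitCount` bits of `value` (bitCount <= 32).
    void write(std::uint32_t value, int bitCount) {
        for (int i = 0; i < bitCount; ++i) {
            if (bitPos_ % 8 == 0) bytes_.push_back(0);  // start a new byte
            if ((value >> i) & 1u) {
                bytes_.back() |= static_cast<std::uint8_t>(1u << (bitPos_ % 8));
            }
            ++bitPos_;
        }
    }
    const std::vector<std::uint8_t>& bytes() const { return bytes_; }

private:
    std::vector<std::uint8_t> bytes_;
    std::size_t bitPos_ = 0;
};

class BitReader {
public:
    explicit BitReader(const std::vector<std::uint8_t>& bytes) : bytes_(bytes) {}

    // Reads `bitCount` bits; the caller must not read past the written data.
    std::uint32_t read(int bitCount) {
        std::uint32_t value = 0;
        for (int i = 0; i < bitCount; ++i) {
            const std::uint32_t bit = (bytes_[bitPos_ / 8] >> (bitPos_ % 8)) & 1u;
            value |= bit << i;
            ++bitPos_;
        }
        return value;
    }

private:
    const std::vector<std::uint8_t>& bytes_;
    std::size_t bitPos_ = 0;
};
```

Writing a one-bit flag followed by a 7-bit health value (0 to 127) fills exactly one byte, compared with the five bytes a `bool` plus a 32-bit `int` would take unpacked.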
  • 63. Prioritizing Data •  Fill packet with most important data first •  Heuristic for most recent data (ex. how close to player) •  Only send what you must – ex. Cull enemy behind the wall
  • 64. Packets •  Smaller than 1400 bytes •  Send packets regularly (Routers allocate bandwidth to those who use it)
  • 65. Smooth Experience •  Interpolation •  Extrapolation – Client Side Prediction – Dead Reckoning
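The two smoothing approaches on this slide reduce to a few lines each. This is a simplified 2D sketch with hypothetical names (`Vec2`, `interpolate`, `deadReckon`); real client-side prediction also has to reconcile the prediction with the next authoritative update, which is omitted here.

```cpp
struct Vec2 {
    float x;
    float y;
};

// Interpolation: blend between two received snapshots, with t in [0, 1].
// The rendered state lags slightly behind the newest data.
inline Vec2 interpolate(Vec2 from, Vec2 to, float t) {
    return { from.x + (to.x - from.x) * t, from.y + (to.y - from.y) * t };
}

// Dead reckoning (extrapolation): project the last known position
// forward along the last known velocity.
inline Vec2 deadReckon(Vec2 lastPos, Vec2 lastVel, float secondsSinceUpdate) {
    return { lastPos.x + lastVel.x * secondsSinceUpdate,
             lastPos.y + lastVel.y * secondsSinceUpdate };
}
```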
  • 66. Profiling Networking •  Make sure networking code is efficient – Measure compute and memory •  Expose what the networking layer is doing – Number of packets – Bandwidth for each packet •  Be aware of situations in which client and server get out of sync.
  • 67. Mass Storage •  Hard Drives •  CD, DVD •  Blu-Ray •  Flash Drives
  • 68. Performance Issues •  Seek Time •  Transfer Rate (ex. 75MB/sec) •  Worst Case – 8ms delay between blocks on disk – 4KB blocks – Loading 1MB -> (1024/4) * 8 = 2048 ms = about 2 secs – Loading 1GB -> about 35 min
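The worst-case arithmetic on this slide, where every block costs a full seek, can be checked with a one-line function. The name `worstCaseLoadMs` is hypothetical.

```cpp
// Worst-case load time in milliseconds when each block read pays the
// full seek delay: (size / blockSize) * seekDelay.
double worstCaseLoadMs(double sizeKB, double blockKB, double seekMs) {
    return (sizeKB / blockKB) * seekMs;
}
```

`worstCaseLoadMs(1024.0, 4.0, 8.0)` returns 2048 ms for 1 MB, and `worstCaseLoadMs(1024.0 * 1024.0, 4.0, 8.0)` returns 2,097,152 ms for 1 GB, which is just under 35 minutes.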
  • 69. Rule •  No disk IO in the inner loops
  • 70. IO Profiling is hard •  File systems optimize themselves based on access patterns •  Disk will rebalance data based on load and sector failure •  Disk, disk controller, file system and OS will cache and reorder requests •  User software may intercept the disk access for virus scanning •  Good idea to test on fresh machines from time to time
  • 71. Disk IO performance tips •  Limit disk access •  Minimize reads and writes – Read larger chunks •  Asynchronous Access •  Optimize file order •  Optimize data for fast loading – Space on disk vs. Time to load (ex. decompressing a JPG file)
  • 72. Disk IO Tips •  Support development and runtime formats •  Support dynamic reloading •  Automate resource processing •  Centralize resource loading –  Resource Managers •  Preload when appropriate •  Stream –  First second of sound in memory –  Small texture mip levels in memory –  Small mesh LODs in memory
  • 73. Concurrent Programming •  Data Parallelism – Scatter Phase – Gather Phase •  Task Parallelism
  • 74. Threading Performance Problems •  Scalability •  Contention •  Balancing
  • 75. Scalability •  The achievable speedup is limited by the parallelizable portion of an algorithm •  Amdahl’s Law – S(N) = 1 / ((1 – P) + P/N) – S: Speedup, N: Processors, P: Parallelizable Portion
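Amdahl's Law from this slide as a function, useful for a quick sanity check before investing in threading work. The name `amdahlSpeedup` is hypothetical.

```cpp
// Amdahl's Law: S(N) = 1 / ((1 - P) + P / N)
//   p: parallelizable fraction of the work, in [0, 1]
//   n: number of processors, n >= 1
double amdahlSpeedup(double p, double n) {
    return 1.0 / ((1.0 - p) + p / n);
}
```

With 90% of the work parallelizable, `amdahlSpeedup(0.9, 4.0)` is about 3.08, and the speedup can never exceed 1 / (1 - 0.9) = 10 regardless of core count.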
  • 76. Contention •  More than one thread accessing the same resource •  Some solutions – Thread Safety (Mutex) – Redundant Data – Efficient Synchronization (Locks, Atomic Operations, …)
  • 77. Balancing •  Ensure all cores are busy •  Eliminate starving
  • 79. False Sharing Ctd. struct vertex { float xyz[3]; /* data 1 */ float tutuv[2]; /* data 2 */ }; vertex triList[N]; ------------------------------------------------------------ struct vertices { float xyz[3][N]; float tutuv[2][N]; }; vertices triList;
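The transcript does not include the slide that introduces false sharing, so the following is a generic illustration rather than the author's example. One common mitigation is to give each thread's data its own cache line. The 64-byte line size and all names here are assumptions.

```cpp
#include <cstddef>

// Assumption: 64 bytes is a typical cache line size, not a guarantee.
constexpr std::size_t kCacheLine = 64;

// Counters packed next to each other: threads updating different
// elements can still invalidate each other's cache line.
struct PackedCounters {
    long perThread[4];
};

// One counter per cache line: alignas pads each element out to
// kCacheLine bytes, so neighbouring counters never share a line.
struct alignas(kCacheLine) PaddedCounter {
    long value = 0;
};

struct PaddedCounters {
    PaddedCounter perThread[4];
};

static_assert(sizeof(PaddedCounter) == kCacheLine,
              "each padded counter should occupy one full cache line");
```

The padding trades memory for reduced contention, so it is worth applying only to data that a multi-threaded profiler shows to be heavily written by several threads.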
  • 80. Multi-threaded Profiling •  Look for time spent on synchronization primitives •  Look out for Heisenbugs! •  Assess Amdahl’s Law •  Use multi-threaded profilers
  • 81. No Synchronization is best •  Lock-free algorithms are great. •  Wait-free algorithms are even better!! Mike Acton notes on wait free coding: http://cellperformance.beyond3d.com/articles/2009/08/roundup-recent-sketches-on-concurrency-data-design-and-performance.html
  • 82. Managed Languages •  Execute on a runtime •  C#, Java, JavaScript, Lua, Python, PHP, ActionScript
  • 83. Concerns for Profiling •  Garbage Collector •  Just in Time compiler •  No high accuracy timers •  Allocation can be costly, usually no stack
  • 84. Managed/Unmanaged •  Gameplay code is usually not performance critical •  Bottlenecks can be replaced with native code
  • 85. Dealing with GC •  Memory pressure causes GC to run frequently and cause sudden hitches •  Memory pressure causes big memory footprint and hurts cache efficiency •  Big total working set needs the GC to check all the pointers •  Incremental GC behavior is helpful but high pressure can force GC to collect all
  • 86. Strategies for dealing with GC •  Less data on heap •  Your own memory management •  Memory pooling •  Keeping temporary objects as class members and reusing them instead of creating new local objects
  • 87. Dealing with JIT •  JIT activation time is important for performance (startup, after a few function calls, …) •  Constructors usually left out (Heavy initialization code needs to be in a helper function) •  JIT might not be available on all platforms
  • 88. Optimizing Animation •  Channel Omission •  Quantization •  Sample Frequency and Key Omission •  Curve Based Compression •  Selective Loading and Streaming •  Hardware Skinning
  • 89. Misc. Optimization Related Topics •  Mesh LOD •  Animation LOD •  AI LOD •  Collision Detection Spatial Partitioning •  Physics Optimizations (GPU, Sleeps, …)
  • 90. PIX Test Case •  PIX (Performance Investigator for Xbox) •  Part of DirectX SDK •  Used for DirectX based applications •  Used for analyzing Garshasp 1 and Garshasp: Temple of the Dragon (Expansion)
  • 91. Using PIX to Analyze Garshasp
  • 95. Garshasp Performance Post-Mortem •  Animation skinning (Intel VTune) –  Switched to Hardware Skinning •  Asset Loading –  Used background thread •  Draw Calls –  Dynamic Far-Clip distance •  High RAM consumption –  Reduced particle quotas –  Reduced Area arrangement (changes in camera system needed) –  Reduced Texture size –  Better strategies for audio loading/unloading
  • 96. Garshasp Ctd. •  Large Video memory usage – Changed mesh geometry – Better seamlessness strategy •  Frame rate drops – Better use of particles – Modifications to camera angles and seamlessness strategy – Smaller areas for more even distribution of resource loading.
  • 97. Some unresolved issues •  Unoptimized animation system •  Overdraw •  Slow Game Object update loop •  No static batching –  Use of vertex color for baked color •  Huge game save data •  Inefficient texture size usage •  No sound/video streaming •  + many more!
  • 98. Biggest Optimization Related Problem No internal resource consciousness!
  • 105. CPU vs. GPU
  • 106. References •  Video Game Optimization, Ben Garney and Eric Preisz •  “How the left and right brain learned to love one another”, Tim Moss, http://timmoss.blogspot.com/2007/02/it-seems-reasonable-that-my-very-first.html •  “Optimization is a Full time job”, Maciej Sinilo, http://msinilo.pl/blog/?p=483 •  “Memory Optimization”, Christer Ericson, http://www.research.scea.com/research/pdfs/GDC2003_Memory_Optimization_18Mar03.pdf •  “A pragmatic approach to optimization”, Niklas Frykholm, http://bitsquid.blogspot.com/2011/12/pragmatic-approach-to-performance.html
  • 107. References Ctd. •  Hacker’s Delight (Henry S. Warren, Addison Wesley 2003) •  Advanced Bit Manipulation-fu, Christer Ericson, http://realtimecollisiondetection.net/blog/?p=78 •  Networking for Programmers, Glenn Fiedler, http://gafferongames.com/networking-for-game-programmers/ •  Source Multiplayer Networking, Valve Software, https://developer.valvesoftware.com/wiki/Source_Multiplayer_Networking
  • 108. References Ctd. •  False sharing and its effect on memory performance, William J. Bolosky, http://static.usenix.org/publications/library/proceedings/sedms4/full_papers/bolosky.txt •  Concurrency, Data Design and Performance, Mike Acton, http://cellperformance.beyond3d.com/articles/2009/08/roundup-recent-sketches-on-concurrency-data-design-and-performance.html •  Diving down the concurrency rabbit hole, Mike Acton, http://www.insomniacgames.com/tech/articles/0809/files/concurrency_rabit_hole.pdf
  • 109. References Ctd. •  Scalar Quantization, Jonathan Blow, http://number-none.com/product/Scalar%20Quantization/index.html •  Are we out of memory, Christian Gyrling, http://www.swedishcoding.com/2008/08/31/are-we-out-of-memory/ •  Practical Efficient Memory Management, Jesus De Santos, http://entland.homelinux.com/blog/2008/08/19/practical-efficient-memory-management/
  • 110. References Ctd. •  Load Hit Store and the restrict keyword, Elan Ruskin, http://assemblyrequired.crashworks.org/2008/07/08/load-hit-stores-and-the-__restrict-keyword/ •  How slow are virtual functions really, Elan Ruskin, http://assemblyrequired.crashworks.org/2009/01/19/how-slow-are-virtual-functions-really/ •  Current Generation Parallelism in Games, Jon Olick, http://s08.idav.ucdavis.edu/olick-current-and-next-generation-parallelism-in-games.pdf
  • 111. References Ctd. •  Real Life Performance Pitfalls, Alan Murphy, http://www.microsoft.com/en-us/download/confirmation.aspx?id=3539 •  Graphics Programming Black Book, Michael Abrash •  Zen of Code Optimization, Michael Abrash •  The Free Lunch is Over, Herb Sutter, http://www.gotw.ca/publications/concurrency-ddj.htm
  • 112. References Ctd. •  Intel Software Optimization Cookbook, http://www.intel.com/intelpress/sum_swcb2.htm •  Pitfalls of Object Oriented Programming, Tony Albrecht, http://www.reddit.com/r/programming/comments/ag43j/pitfalls_of_object_oriented_programming_pdf/ •  Microsoft PIX, http://msdn.microsoft.com/en-us/library/ee663275(v=vs.85).aspx
  • 113. References Ctd. •  Top 10 Myths of Video Game Optimization, http://www.gamasutra.com/view/feature/130296/the_top_10_myths_of_video_game_.php?print=1