Unity Internals: Memory and Performance

21.879 visualizaciones

Publicado el

In this presentation we will provide in-depth knowledge about the Unity runtime. The first part will focus on memory and how to deal with fragmentation and garbage collection. The second part on performance profiling and optimizations. Finally, there will be an overview of debugging and profiling improvements in the newly announced Unity 5.0.

0 comentarios
57 recomendaciones
Estadísticas
Notas
  • Sé el primero en comentar

Sin descargas
Visualizaciones
Visualizaciones totales
21.879
En SlideShare
0
De insertados
0
Número de insertados
708
Acciones
Compartido
0
Descargas
310
Comentarios
0
Recomendaciones
57
Insertados 0
No insertados

No hay notas en la diapositiva.

Unity Internals: Memory and Performance

  1. 1. Unity Internals: Memory and Performance Moscow, 16/05/2014 Marco Trivellato – Field Engineer
  2. 2. Page6/9/14 2 This Talk Goals and Benefits
  3. 3. Page Who Am I ? •  Now Field Engineer @ Unity •  Previously, Software Engineer •  Mainly worked on game engines •  Shipped several video games: •  Captain America: Super Soldier •  FIFA ‘07 – FIFA ’10 •  Fight Night: Round 3 6/9/14 3
  4. 4. Page Topics •  Memory Overview •  Garbage Collection •  Mesh Internals •  Scripting •  Job System •  How to use the Profiler 6/9/14 4
  5. 5. Page6/9/14 5 Memory Overview
  6. 6. Page Memory Domains •  Native (internal) •  Asset Data: Textures, AudioClips, Meshes •  Game Objects & Components: Transform, etc.. •  Engine Internals: Managers, Rendering, Physics, etc.. •  Managed - Mono •  Script objects (Managed dlls) •  Wrappers for Unity objects: Game objects, assets, components •  Native Dlls •  User’s dlls and external dlls (for example: DirectX) 6/9/14 6
  7. 7. Page Native Memory: Internal Allocators •  Default •  GameObject •  Gfx •  Profiler 5.x: We are considering to expose an API for using a native allocator in Dlls 6/9/14 7
  8. 8. Page Managed Memory •  Value types (bool, int, float, struct, ...) •  Exist in stack memory. De-allocated when removed from the stack. No Garbage. •  Reference types (classes) •  Exist on the heap and are handled by the mono/.net GC. Removed when no longer being referenced. •  Wrappers for Unity Objects : •  GameObject •  Assets : Texture2D, AudioClip, Mesh, … •  Components : MeshRenderer, Transform, MonoBehaviour 6/9/14 8
  9. 9. Page Mono Memory Internals •  Allocates system heap blocks for internal allocator •  Will allocate new heap blocks when needed •  Heap blocks are kept in Mono for later use •  Memory can be given back to the system after a while •  …but it depends on the platform è don’t count on it •  Garbage collector cleans up •  Fragmentation can cause new heap blocks even though memory is not exhausted 6/9/14 9
  10. 10. Page6/9/14 10 Garbage Collection
  11. 11. Page Unity Object wrapper •  Some Objects used in scripts have large native backing memory in unity •  Memory not freed until Finalizers have run 6/9/14 11 WWW Decompression buffer Compressed file Decompressed file Managed Native
  12. 12. Page Mono Garbage Collection •  GC.Collect •  Runs on the main thread when •  Mono exhausts the heap space •  Or user calls System.GC.Collect() •  Finalizers •  Run on a separate thread •  Controlled by mono •  Can have several seconds delay •  Unity native memory •  Dispose() cleans up internal memory •  Eventually called from finalizer •  Manually call Dispose() to cleanup 6/9/14 12 Main thread Finalizer thread www = null; new(someclass); //no more heap -> GC.Collect(); www.Dispose(); .....
  13. 13. Page Garbage Collection •  Roots are not collected in a GC.Collect •  Thread stacks •  CPU Registers •  GC Handles (used by Unity to hold onto managed objects) •  Static variables!! •  Collection time scales with managed heap size •  The more you allocate, the slower it gets 6/9/14 13
  14. 14. Page GC: does lata layout matter ? struct Stuff { int a; float b; bool c; string leString; } Stuff[] arrayOfStuff; << Everything is scanned. GC takes more time VS int[] As; float[] Bs; bool[] Cs; string[] leStrings; << Only this is scanned. GC takes less time. 6/9/14 14
  15. 15. Page GC: Best Practices •  Reuse objects è Use object pools •  Prefer stack-based allocations è Use struct instead of class •  System.GC.Collect can be used to trigger collection •  Calling it 6 times returns the unused memory to the OS •  Manually call Dispose to cleanup immediately 6/9/14 15
  16. 16. Page Avoid temp allocations •  Don’t use FindObjects or LINQ •  Use StringBuilder for string concatenation •  Reuse large temporary work buffers •  ToString() •  .tag è use CompareTag() instead 6/9/14 16
  17. 17. Page Unity API Temporary Allocations Some Examples: •  GetComponents<T> •  Vector3[] Mesh.vertices •  Camera[] Camera.allCameras •  foreach •  does not allocate by definition •  However, there can be a small allocation, depending on the implementation of .GetEnumerator() 5.x: We are working on new non-allocating versions 6/9/14 17
  18. 18. Page Memory fragmentation •  Memory fragmentation is hard to account for •  Fully unload dynamically allocated content •  Switch to a blank scene before proceeding to next level •  This scene could have a hook where you may pause the game long enough to sample if there is anything significant in memory •  Ensure you clear out variables so GC.Collect will remove as much as possible •  Avoid allocations where possible •  Reuse objects where possible within a scene play •  Clear them out for map load to clean the memory 6/9/14 18
  19. 19. Page Unloading Unused Assets •  Resources.UnloadUnusedAssets will trigger asset garbage collection •  It looks for all unreferenced assets and unloads them •  It’s an async operation •  It’s called internally after loading a level •  Resources.UnloadAsset is preferable •  you need to know exactly what you need to Unload •  Unity does not have to scan everything •  Unity 5.0: Multi-threaded asset garbage collection 6/9/14 19
  20. 20. Page6/9/14 20 Mesh Internals Memory vs. Cycles
  21. 21. Page Mesh Read/Write Option •  It allows you to modify the mesh at run-time •  If enabled, a system-copy of the Mesh will remain in memory •  It is enabled by default •  In some cases, disabling this option will not reduce the memory usage •  Skinned meshes •  iOS Unity 5.0: disable by default – under consideration 6/9/14 21
  22. 22. Page Non-Uniform scaled Meshes We need to correctly transform vertex normals •  Unity 4.x: •  transform the mesh on the CPU •  create an extra copy of the data •  Unity 5.0 •  Scaled on GPU •  Extra memory no longer needed 6/9/14 22
  23. 23. Page Static Batching What is it ? •  It’s an optimization that reduces number of draw calls and state changes How do I enable it ? •  In the player settings + Tag the object as static 6/9/14 23
  24. 24. Page Static Batching How does it work internally ? •  Build-time: Vertices are transformed to world- space •  Run-time: Index buffer is created with indices of visible objects Unity 5.0: •  Re-implemented static batching without copying of index buffers 6/9/14 24
  25. 25. Page Dynamic Batching What is it ? •  Similar to Static Batching but it batches non-static objects at run-time How do I enable it ? •  In the player settings •  no need to tag. it auto-magically works… 6/9/14 25
  26. 26. Page Dynamic Batching How does it work internally ? •  objects are transformed to world space on the CPU •  Temporary VB & IB are created •  Rendered in one draw call Unity 5.x: we are considering to expose per-platform parameters 6/9/14 26
  27. 27. Page Mesh Skinning Different Implementations depending on platform: •  x86: SSE •  iOS/Android/WP8: Neon optimizations •  D3D11/XBoxOne/GLES3.0: GPU •  XBox360, WiiU: GPU (memexport) •  PS3: SPU •  WiiU: GPU w/ stream out Unity 5.0: Skinned meshes use less memory by sharing index buffers between instances 6/9/14 27
  28. 28. Page6/9/14 28 Scripting
  29. 29. Page Unity 5.0: Mono •  No upgrade •  Mainly bug fixes •  New tech in WebGL: IL2CPP •  http://blogs.unity3d.com/2014/04/29/on-the-future-of- web-publishing-in-unity/ •  Stay tuned: there will be a blog post about it 6/9/14 29
  30. 30. Page GetComponent<T> It asks the GameObject, for a component of the specified type: •  The GO contains a list of Components •  Each Component type is compared to T •  The first Component of type T (or that derives from T), will be returned to the caller •  Not too much overhead but it still needs to call into native code 6/9/14 30
  31. 31. Page Unity 5.0: Property Accessors •  Most accessors will be removed in Unity 5.0 •  The objective is to reduce dependencies, therefore improve modularization •  Transform will remain •  Existing scripts will be converted. Example: in 5.0: 6/9/14 31
  32. 32. Page Transform Component •  this.transform is the same as GetComponent<Transform>() •  transform.position/rotation needs to: •  find Transform component •  Traverse hierarchy to calculate absolute position •  Apply translation/rotation •  transform internally stores the position relative to the parent •  transform.localPosition = new Vector(…) è simple assignment •  transform.position = new Vector(…) è costs the same if no father, otherwise it will need to traverse the hierarchy up to transform the abs position into local •  finally, other components (collider, rigid body, light, camera, etc..) will be notified via messages 6/9/14 32
  33. 33. Page Instantiate API: •  Object Instantiate(Object, Vector3, Quaternion); •  Object Instantiate(Object); Implementation: •  Clone GameObject Hierarchy and Components •  Copy Properties •  Awake •  Apply new Transform (if provided) 6/9/14 33
  34. 34. Page Instantiate cont..ed •  Awake can be expensive •  AwakeFromLoad (main thread) •  clear states •  internal state caching •  pre-compute Unity 5.0: •  Allocations have been reduced •  Some inner loops for copying the data have been optimized 6/9/14 34
  35. 35. Page JIT Compilation What is it ? •  The process in which machine code is generated from CIL code during the application's run-time Pros: •  It generates optimized code for the current platform Cons: •  Each time a method is called for the first time, the application will suffer a certain performance penalty because of the compilation 6/9/14 35
  36. 36. Page JIT compilation spikes What about pre-JITting ? •  RuntimeHelpers.PrepareMethod does not work: …better to use MethodHandle.GetFunctionPointer() 6/9/14 36
  37. 37. Page6/9/14 37 Job System
  38. 38. Page Unity 5.0: Job System (internal) The goals of the job system: •  make it easy to write very efficient job based multithreaded code •  The jobs should be able to run safely in parallel to script code 6/9/14 38
  39. 39. Page Job System: Why ? Modern architectures are multi-core: •  XBox 360: 3 cores •  PS4/Xbox One: 8 cores …which includes mobile devices: •  iPhone 4S: 2 cores •  Galaxy S3: 4 cores 6/9/14 39
  40. 40. Page Job System: What is it ? •  It’s a Framework that we are going to use in existing and new sub-systems •  We want to have Animation, NavMesh, Occlusion, Rendering, etc… run as much as possible in parallel •  This will ultimately lead to better performance 6/9/14 40
  41. 41. Page Unity 5.0: Profiler Timeline View It’s a tool that allows you to analyse internal (native) threads execution of a specific frame 6/9/14 41
  42. 42. Page Unity 5.0: Frame Debugger 6/9/14 42
  43. 43. Page6/9/14 43 Conclusions
  44. 44. Page Budgeting Memory How much memory is available ? •  It depends… •  For example, on 512mb devices running iOS 6.0: ~250mb. A bit less with iOS 7.0 What’s the baseline ? •  Create an empty scene and measure memory •  Don’t forget that the profiler requires some memory •  For example: on Android 15.5mb (+ 12mb profiler) 6/9/14 44
  45. 45. Page Profiling •  Don’t make assumptions •  Profile on target device •  Editor != Player •  Platform X != Platform Y •  Managed Memory is not returned to Native Land! For best results…: •  Profile early and regularly 6/9/14 45
  46. 46. Page6/9/14 46 Questions ? marcot@unity3d.com - Twitter: @m_trive

×