11. GPU: Optimized for Throughput
- Use much simpler cores
- Use vectorization to replicate simple cores
[Figure: Control (Fetch / Decode) units, ALUs, Execution Context (Registers), Shared Execution Context. Courtesy of K. Fatahalian]
12. Take with a Grain of Salt
- Raw Compute Power != Application Performance
- Not all applications are suitable for GPUs
- Developing fully optimized code for GPUs is non-trivial and requires computational rethinking
- A GPU core is MUCH SLOWER than a CPU core
- A lot of parallelism is needed to hide memory latency
- Reduce branching as much as possible
- Think of an army of synchronized snails
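The advice to reduce branching can be illustrated with a toy cost model. This is a minimal sketch, not a model of any particular GPU: the function name `lockstep_steps`, the group size of 32 lanes, and the step costs are all assumptions made for illustration. It only captures the idea that lanes sharing one instruction stream must sit through every side of a branch that any lane in the group takes.

```python
def lockstep_steps(takes_branch, then_cost, else_cost):
    """Steps needed by a group of lanes that share one instruction stream.

    takes_branch: list of booleans, one per lane (hypothetical input).
    then_cost / else_cost: steps each side of the branch takes.
    """
    any_then = any(takes_branch)
    any_else = not all(takes_branch)
    # Lanes that did not take a side sit idle while the others execute it,
    # so the group pays for every side that at least one lane needs.
    return (then_cost if any_then else 0) + (else_cost if any_else else 0)


# All 32 lanes agree: only one side of the branch is executed.
uniform = lockstep_steps([True] * 32, then_cost=10, else_cost=10)
# Lanes alternate: the group executes both sides, one after the other.
divergent = lockstep_steps([i % 2 == 0 for i in range(32)], then_cost=10, else_cost=10)
print(uniform, divergent)  # 10 20
```

In this model a single disagreeing lane is enough to make the whole group pay for both paths, which is one way to read the "synchronized snails" image.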
13. GPU Potential for Sequence Alignment
- Why sequence alignment?
  - Fundamental in sequence analysis
  - Computationally intensive
- Preliminary study
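To give a sense of why alignment is computationally intensive, here is a minimal sketch of Smith-Waterman local alignment scoring, the algorithm named on the next slide. The scoring parameters (`match=2`, `mismatch=-1`, `gap=-1`) and the linear gap penalty are assumptions chosen for illustration, not values from the study. The point is the nested loop: every cell of a table of size len(a) x len(b) is filled, so the work grows with the product of the two sequence lengths.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score of strings a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)  # previous row of the scoring table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1].
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


print(smith_waterman_score("ACGT", "ACGT"))        # 8
print(smith_waterman_score("TTACGTT", "GGACGGG"))  # 6
```

Only two rows are kept in memory because the sketch returns the score alone; recovering the alignment itself would require keeping the full table or traceback pointers.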
14. Lessons Learnt
- CPU-optimized code may be difficult to accelerate on GPUs
  - BLASTP: 6.5x speedup vs. Smith-Waterman: 30x speedup
- Requires rethinking of algorithm design
  - A scalable but less optimal algorithm can be better
  - Example: RMAP
    - Originally used a hash table to find matches (O(n))
    - Switched to a slower binary search algorithm (O(n log n))
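The RMAP example can be sketched as a lookup over a flat sorted list in place of a hash table. This is an illustrative sketch only and not RMAP's actual code: the names `build_index` and `lookup` are hypothetical, and the fixed iteration count is an assumption about what makes such a search friendly to lockstep execution (the slide does not state the reason for the switch). Every query runs the same number of loop iterations, whether or not it matches.

```python
def build_index(keys):
    """Sort once; the flat sorted list stands in for a hash table."""
    return sorted(keys)


def lookup(sorted_keys, query):
    """Return the position of query in sorted_keys, or -1 if absent."""
    lo, hi = 0, len(sorted_keys)
    # Fixed trip count: bit_length() iterations are enough to shrink the
    # search interval to empty, so all queries do the same amount of work.
    for _ in range(len(sorted_keys).bit_length()):
        mid = (lo + hi) // 2
        if lo < hi and sorted_keys[mid] < query:
            lo = mid + 1
        else:
            hi = mid
    if lo < len(sorted_keys) and sorted_keys[lo] == query:
        return lo
    return -1


index = build_index([42, 7, 19, 3, 88])  # -> [3, 7, 19, 42, 88]
print(lookup(index, 19))  # 2
print(lookup(index, 5))   # -1
```

Each lookup costs more comparisons than a hash probe would on average, which matches the slide's point that the asymptotically slower choice can still be the better fit for the hardware.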
16. Compute the Cure Initiative
- Partnership between NVIDIA and VT
- Goal: Leverage GPU power to fight cancer
- Current focus: GPU-accelerated sequence alignment framework
- http://www.nvidia.com/object/compute-the-cure.html
17. Conclusion
- Democratizing DNA sequencing requires more accessible HPC resources
- GPUs present both opportunities and challenges
- Initial results are promising
- For more information, see the Synergy website: http://synergy.cs.vt.edu
18. Acknowledgement
- Collaborators
  - David Mittelman, Virginia Bioinformatics Institute
- Students
  - Ashwin Aji
  - Shucai Xiao
- Funding
  - NVIDIA Compute the Cure Program
  - NSF Center for High-Performance Reconfigurable Computing