Table of Contents

1 Introduction

Part I Fundamental Concepts
2 Heterogeneous data parallel computing
3 Multidimensional grids and data
4 Compute architecture and scheduling
5 Memory architecture and data locality
6 Performance considerations

Part II Parallel Patterns
7 Convolution: An introduction to constant memory and caching
8 Stencil
9 Parallel histogram
10 Reduction and minimizing divergence
11 Prefix sum (scan)
12 Merge: An introduction to dynamic input data identification

Part III Advanced Patterns and Applications
13 Sorting
14 Sparse matrix computation
15 Graph traversal
16 Deep learning
17 Iterative magnetic resonance imaging reconstruction
18 Electrostatic potential map
19 Parallel programming and computational thinking

Part IV Advanced Practices
20 Programming a heterogeneous computing cluster: An introduction to CUDA streams
21 CUDA dynamic parallelism
22 Advanced practices and future evolution
23 Conclusion and outlook

Appendix A: Numerical considerations
Synopsis

Programming Massively Parallel Processors: A Hands-on Approach shows students and professionals alike the basic concepts of parallel programming and GPU architecture. Concise, intuitive, and practical, it is based on years of road-testing in the authors' own parallel computing courses. Various techniques for constructing and optimizing parallel programs are explored in detail, while case studies demonstrate the development process, which begins with computational thinking and ends with effective and efficient parallel programs. The new edition includes updated coverage of CUDA, including newer libraries such as cuDNN. New chapters on frequently used parallel patterns have been added, and case studies have been updated to reflect current industry practices.