Skip to content

zxm5010/Project-1

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

15 Commits
 
 
 
 
 
 
 
 

Repository files navigation

PART 4 : PERFORMANCE ANALYSIS

  • How does changing the tile and block sizes change performance? Why?

Changing the tile and block sizes may increase the speed if we can utilize the shared memory efficiently. Smaller the block size, smaller the shared memory threads can have. If threads frequently access global memory instead of shared memory, then the performance go down. For Nbody problem, we actually calculated forces between two paired body twice. If we can store them into shared memory, then the performance can be doubled.

  • How does changing the number of planets change performance? Why?

As we can see from the result, there is only 1.5 FPS for 5000 bodies, and 60FPS for 100 bodies. Performance decreses in O(N2) if the number of planets increases. The main reason is the accelerate() function has a for loop which integrates all forces from other bodies serially.To increase the performance, we can try to make the additions also in parallell.

  • Without running experiments, how would you expect the serial and GPU verions of matrix_math to compare? Why?

GPU version of matrix_math runs faster than serial computation. Since the computation for each element in the matrix is independent, all computations can be done parallely, which means those computations can be done simultaneously rather than one-by-one in CPU.

About

Introduction to CUDA kernels

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • C++ 62.2%
  • C 37.3%
  • Other 0.5%