- Dec 2017
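Warp divergence occurs when threads inside a warp branch to different execution paths. Instead of all 32 threads in the warp executing the same instruction, the hardware runs each path in turn with the threads on the other path masked off. For a two-way branch that splits the warp, on average only half of the threads are active per instruction, so the divergent section loses roughly 50% of its throughput.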
make as many consecutive threads as possible do the same thing
This is an important take-home message for dealing with branch divergence.
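To make the advice concrete, here is a minimal host-side sketch in Python. It is a model of the idea, not GPU code. It assumes a warp size of 32 and a two-way branch whose paths cost the same, and every name in it (`passes_per_warp`, `interleaved`, `warp_aligned`, `lane_utilization`) is hypothetical, introduced only for illustration. It compares a branch condition that alternates between neighbouring threads with one that keeps each run of 32 consecutive threads on the same path.

```python
WARP_SIZE = 32  # threads per warp, as in the note above


def passes_per_warp(num_threads, takes_branch):
    """Count, for each warp, how many branch paths run one after another.

    1 means all threads in the warp agree (no divergence);
    2 means the warp diverges and both paths are executed serially.
    """
    passes = []
    for start in range(0, num_threads, WARP_SIZE):
        lanes = range(start, min(start + WARP_SIZE, num_threads))
        paths = {takes_branch(tid) for tid in lanes}
        passes.append(len(paths))
    return passes


def interleaved(tid):
    # Comparable to a kernel condition like `threadIdx.x % 2 == 0`:
    # neighbouring threads disagree, so every warp diverges.
    return tid % 2 == 0


def warp_aligned(tid):
    # Comparable to `(threadIdx.x / warpSize) % 2 == 0`:
    # each group of 32 consecutive threads takes the same path.
    return (tid // WARP_SIZE) % 2 == 0


def lane_utilization(passes):
    """Fraction of lane-slots doing useful work, assuming equal-cost paths."""
    return len(passes) / sum(passes)


if __name__ == "__main__":
    n = 128  # four warps
    for name, condition in [("interleaved", interleaved),
                            ("warp_aligned", warp_aligned)]:
        p = passes_per_warp(n, condition)
        print(name, p, lane_utilization(p))
    # interleaved [2, 2, 2, 2] 0.5
    # warp_aligned [1, 1, 1, 1] 1.0
```

Both conditions send half of the 128 threads down each path. Only the mapping of threads to paths differs, and under this model the interleaved version reproduces the 50% loss described above while the warp-aligned version avoids it.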