# Why should I use a reduction rather than an atomic variable?

Assume we want to count something in an OpenMP loop. Compare the reduction

```c
int counter = 0;
#pragma omp for reduction( + : counter )
for (...) {
    ...
    counter++;
}
```

with the atomic increment

```c
int counter = 0;
#pragma omp for
for (...) {
    ...
    #pragma omp atomic
    counter++;
}
```

The atomic access provides the result immediately, while a reduction only assumes its correct value at the end of the loop. For instance, reductions do not allow this:

```c
int t = counter;
if (t % 1000 == 0) {
    printf ("%dk iterations\n", t/1000);
}
```

thus providing less functionality.
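As a side note on the progress snippet: a plain read of `counter` next to concurrent atomic increments is itself a race. A minimal sketch of one way to get the value safely, assuming OpenMP 3.1 or later for `atomic capture`; the function name `count_with_progress` and the parameter `report_every` are made up for illustration:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: runs n iterations, counts them with an atomic
   counter, and prints a progress line every `report_every` iterations. */
int count_with_progress(int n, int report_every)
{
    assert(n >= 0 && report_every > 0);
    int counter = 0;

    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        int t;
        /* Increment and read back in one atomic step, so every thread
           sees a distinct value of the counter. */
        #pragma omp atomic capture
        t = ++counter;

        if (t % report_every == 0) {
            printf("%d iterations\n", t);
        }
    }
    return counter;
}
```

Because each thread captures a distinct value, every multiple of `report_every` is reported exactly once, although the lines may appear out of order.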

Why would I ever use a reduction instead of atomic access to a counter?

Performance

Because an atomic variable comes with a price, and that price is synchronization. To ensure that there are no race conditions, i.e. two threads modifying the same variable at the same moment, the threads must synchronize on every update, which effectively means that you lose parallelism: the updates are serialized.

Reduction, on the other hand, is a general operation that can be carried out in parallel using parallel reduction algorithms: each thread accumulates its own partial result, and the partial results are combined at the end. Articles on parallel reduction algorithms cover this in more detail.
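To make the difference concrete, here is a minimal sketch of three ways to do the same count. The function names and the "count the multiples of `k`" workload are my own placeholders, not anything from the question, and the third variant only approximates what a `reduction` clause does; a real OpenMP runtime is free to combine the partial results differently.

```c
#include <assert.h>

/* Counts the multiples of k in [0, n) with one atomic update per hit:
   every increment synchronizes on the shared counter. */
int count_with_atomic(int n, int k)
{
    assert(n >= 0 && k > 0);
    int counter = 0;
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        if (i % k == 0) {
            #pragma omp atomic
            counter++;
        }
    }
    return counter;
}

/* Same count with a reduction: each thread increments a private copy,
   and the copies are summed when the loop ends. */
int count_with_reduction(int n, int k)
{
    assert(n >= 0 && k > 0);
    int counter = 0;
    #pragma omp parallel for reduction(+ : counter)
    for (int i = 0; i < n; i++) {
        if (i % k == 0) {
            counter++;
        }
    }
    return counter;
}

/* Hand-written approximation of the reduction: one private partial
   count per thread and a single synchronized add per thread. */
int count_with_manual_partials(int n, int k)
{
    assert(n >= 0 && k > 0);
    int counter = 0;
    #pragma omp parallel
    {
        int local = 0;              /* private to each thread */
        #pragma omp for
        for (int i = 0; i < n; i++) {
            if (i % k == 0) {
                local++;
            }
        }
        #pragma omp atomic
        counter += local;           /* once per thread, not once per hit */
    }
    return counter;
}
```

All three return the same value. The atomic version synchronizes once per hit, while the other two synchronize roughly once per thread, which is where the performance difference comes from.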

### Addendum: Getting a sense of how a parallel reduction works

Imagine a scenario where you have `4` threads and you want to reduce an `8`-element array `A`. You could do this in 3 steps (check the attached image to get a better sense of what I am talking about):

• Step 0. Threads with index `i<4` take care of the result of summing `A[i]=A[i]+A[i+4]`.
• Step 1. Threads with index `i<4/2` take care of the result of summing `A[i]=A[i]+A[i+4/2]`.
• Step 2. Threads with index `i<4/4` take care of the result of summing `A[i]=A[i]+A[i+4/4]`.

At the end of this process you will have the result of your reduction in the first element of `A`, i.e. `A[0]`.
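The steps above can be sketched as a loop over a halving stride. This is only an illustration of the idea, assuming the array length is a power of two and that overwriting the input is acceptable; `tree_reduce_sum` is a name I made up, and an actual OpenMP implementation may reduce in a different way.

```c
#include <assert.h>

/* In-place tree reduction: sums the n elements of a into a[0].
   Assumes n is a power of two; the contents of a are overwritten. */
int tree_reduce_sum(int *a, int n)
{
    assert(n > 0 && (n & (n - 1)) == 0);

    /* stride = n/2, n/4, ..., 1 corresponds to Step 0, Step 1, ... */
    for (int stride = n / 2; stride > 0; stride /= 2) {
        /* Within one step, iteration i writes only a[i] and reads only
           a[i + stride], so the iterations are independent. The implicit
           barrier at the end of the parallel loop separates the steps. */
        #pragma omp parallel for
        for (int i = 0; i < stride; i++) {
            a[i] += a[i + stride];
        }
    }
    return a[0];
}
```

For `int a[8] = {1, 2, 3, 4, 5, 6, 7, 8};`, calling `tree_reduce_sum(a, 8)` goes through three steps and returns 36, with the same value left in `a[0]`.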