We started to discuss OpenMP last week.
Please read this OpenMP Tutorial from Lawrence Livermore National Laboratory.
The official OpenMP web site is http://www.openmp.org.
Functionally, OpenMP programming is similar to pthreads programming, but OpenMP shifts more of the burden to the compiler.
A few things to mention (or recap):
Last time, we looked at an example of an OpenMP matrix-matrix multiply, where we used a #pragma omp parallel for directive to let OpenMP parallelize our for loop.
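As a reminder, the loop-level version looks something like the following sketch (a minimal reconstruction, not the exact code from the example; the matrix size and variable names here are made up):

#include <stdio.h>

#define N 500

static double a[N][N], b[N][N], c[N][N];

int main() {
    int i, j, k;

    /* fill a and b with something */
    for (i = 0; i < N; i++)
        for (j = 0; j < N; j++) {
            a[i][j] = i + j;
            b[i][j] = i - j;
        }

    /* OpenMP parallelizes the outermost loop: iterations (rows of c)
       are divided among the threads; j and k are declared outside the
       loop, so they must be made private explicitly */
#pragma omp parallel for private(j, k)
    for (i = 0; i < N; i++)
        for (j = 0; j < N; j++) {
            c[i][j] = 0.0;
            for (k = 0; k < N; k++)
                c[i][j] += a[i][k] * b[k][j];
        }

    printf("c[0][0] = %f\n", c[0][0]);
    return 0;
}

Compile with a flag such as gcc's -fopenmp to enable the pragmas.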
We can also take more control, similar to the way we did with pthreads:
matmult_omp_explicit.tgz Also available in /home/faculty/terescoj/shared/cs338/lect06.
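The idea in the explicit version is along these lines (a sketch of the approach, not the tarball's actual code): each thread asks for its id and the thread count, and computes its own band of rows, much as we did with pthreads.

#include <stdio.h>
#include <omp.h>

#define N 500

static double a[N][N], b[N][N], c[N][N];

int main() {
    /* initialize a and b (omitted for brevity) */

#pragma omp parallel
    {
        int id = omp_get_thread_num();        /* which thread am I? */
        int nthreads = omp_get_num_threads(); /* how many of us are there? */
        int i, j, k;

        /* each thread takes a contiguous band of rows */
        int start = id * N / nthreads;
        int end = (id + 1) * N / nthreads;

        for (i = start; i < end; i++)
            for (j = 0; j < N; j++) {
                c[i][j] = 0.0;
                for (k = 0; k < N; k++)
                    c[i][j] += a[i][k] * b[k][j];
            }
    }

    printf("c[0][0] = %f\n", c[0][0]);
    return 0;
}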
matmult_omp_bagoftasks.tgz Also available in /home/faculty/terescoj/shared/cs338/lect06.
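The bag-of-tasks version presumably works along these lines (again a hedged sketch, not the code from the tarball): the rows form a "bag" of tasks, and each thread repeatedly grabs the next unclaimed row, using a critical section to protect the shared counter.

#include <stdio.h>
#include <omp.h>

#define N 500

static double a[N][N], b[N][N], c[N][N];
static int next_row = 0;   /* shared: the next task in the bag */

int main() {
    /* initialize a and b (omitted) */

#pragma omp parallel
    {
        int i, j, k;
        while (1) {
            /* take the next row out of the bag, one thread at a time */
#pragma omp critical
            {
                i = next_row;
                next_row++;
            }
            if (i >= N) break;   /* bag is empty */

            for (j = 0; j < N; j++) {
                c[i][j] = 0.0;
                for (k = 0; k < N; k++)
                    c[i][j] += a[i][k] * b[k][j];
            }
        }
    }
    printf("c[0][0] = %f\n", c[0][0]);
    return 0;
}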
A parallel directive can take a number of clauses to define how variables are to be treated.
private: each thread gets its own copy of the variable. Any previous value is not seen by the threads, and that value is still there when the parallel block ends.
openmp_private.tgz Also available in /home/faculty/terescoj/shared/cs338/lect06.
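A minimal sketch of private semantics (the variable names are made up here, not necessarily those in the tarball):

#include <stdio.h>
#include <omp.h>

int main() {
    int x = 100;

    /* each thread gets its own uninitialized copy of x;
       the 100 assigned above is not visible inside the region */
#pragma omp parallel private(x)
    {
        x = omp_get_thread_num();   /* safe: this is my own copy */
        printf("thread %d sees x = %d\n", omp_get_thread_num(), x);
    }

    /* the original x is untouched: this prints 100 */
    printf("after the parallel block, x = %d\n", x);
    return 0;
}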
shared: all threads access a single copy of the variable. Any previous value is seen by all threads, and any changes made by threads will persist when the parallel block ends.
openmp_shared.tgz Also available in /home/faculty/terescoj/shared/cs338/lect06.
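And a corresponding sketch for shared (again a made-up example, with a critical section to keep the concurrent updates safe):

#include <stdio.h>

int main() {
    int x = 100;

    /* all threads see and update the same x */
#pragma omp parallel shared(x)
    {
#pragma omp critical
        x++;
    }

    /* the changes persist: x is now 100 plus the number of threads */
    printf("after the parallel block, x = %d\n", x);
    return 0;
}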
openmp_reduction.tgz Also available in /home/faculty/terescoj/shared/cs338/lect06.
What is a reduction? Basically, the given operator is applied to combine each thread's copy of the variable, and the overall result is stored in the variable when the parallel block ends.
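A minimal reduction sketch (a made-up example):

#include <stdio.h>

#define N 1000

int main() {
    int i, sum = 0;

    /* each thread accumulates into its own copy of sum;
       the copies are combined with + when the loop ends */
#pragma omp parallel for reduction(+:sum)
    for (i = 0; i < N; i++)
        sum += i;

    printf("sum = %d\n", sum);   /* 499500, regardless of thread count */
    return 0;
}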
We'll see many examples when we talk about message passing. Here's one that's a little more interesting than the made-up example above:
matmult_omp_explicit2.tgz Also available in /home/faculty/terescoj/shared/cs338/lect06.
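Without reproducing the tarball's code, one plausible way a matrix-matrix multiply can use a reduction (an illustrative guess, not necessarily what this example does) is to compute each inner product with a reduction over the k loop:

#include <stdio.h>

#define N 500

static double a[N][N], b[N][N], c[N][N];

int main() {
    int i, j, k;
    /* initialize a and b (omitted) */

    for (i = 0; i < N; i++)
        for (j = 0; j < N; j++) {
            double sum = 0.0;
            /* the k loop is a classic reduction: each thread sums part
               of the inner product, and OpenMP combines the partial sums */
#pragma omp parallel for reduction(+:sum)
            for (k = 0; k < N; k++)
                sum += a[i][k] * b[k][j];
            c[i][j] = sum;
        }

    printf("c[0][0] = %f\n", c[0][0]);
    return 0;
}

Note that parallelizing the innermost loop this way starts a parallel region for every entry of c, so it is more an illustration of the clause than a fast way to multiply matrices.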
There are several other directives worth looking at a bit:
sections: defines sections of code (that aren't a loop) that can be executed concurrently. An overly simplistic example:
openmp_sections.tgz Also available in /home/faculty/terescoj/shared/cs338/lect06.
Each defined section is a block that can be assigned to a thread.
This is useful when we have different tasks to assign to the threads we create; a small sketch follows.
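Here is a minimal sketch of the sections idea (a made-up example, not the tarball's code):

#include <stdio.h>
#include <omp.h>

int main() {
#pragma omp parallel sections
    {
        /* each section is an independent block; OpenMP hands
           the sections to different threads */
#pragma omp section
        printf("task A, run by thread %d\n", omp_get_thread_num());

#pragma omp section
        printf("task B, run by thread %d\n", omp_get_thread_num());

#pragma omp section
        printf("task C, run by thread %d\n", omp_get_thread_num());
    }
    return 0;
}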
single: used within a parallel block, this specifies that the block inside the single should be executed by exactly one thread.
master: a lot like single, but we are guaranteed that the master thread does the execution.
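A minimal sketch contrasting the two (a made-up example):

#include <stdio.h>
#include <omp.h>

int main() {
#pragma omp parallel
    {
        /* exactly one thread (whichever arrives) runs this */
#pragma omp single
        printf("single: run by thread %d\n", omp_get_thread_num());

        /* only thread 0, the master, runs this */
#pragma omp master
        printf("master: run by thread %d\n", omp_get_thread_num());
    }
    return 0;
}

One difference worth knowing: single has an implied barrier at its end (unless a nowait clause is given), while master does not.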
critical: we've seen this - it defines a critical section.
barrier: used within a parallel block, this causes the threads to synchronize at this point. This could be used, for example, to make sure that the threads all complete some preliminary computation before moving on to their next step.
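A minimal barrier sketch (made up):

#include <stdio.h>
#include <omp.h>

int main() {
#pragma omp parallel
    {
        printf("thread %d: phase 1\n", omp_get_thread_num());

        /* no thread starts phase 2 until all have finished phase 1 */
#pragma omp barrier

        printf("thread %d: phase 2\n", omp_get_thread_num());
    }
    return 0;
}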
atomic: forces a simple statement that modifies a single variable to be atomic. It is essentially a critical section, but since it is more restrictive, the compiler may choose more efficient techniques.
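A minimal atomic sketch (made up):

#include <stdio.h>

#define N 100000

int main() {
    int i, count = 0;

#pragma omp parallel for
    for (i = 0; i < N; i++) {
        /* a simple update of a single variable: atomic is enough,
           and potentially cheaper than a full critical section */
#pragma omp atomic
        count++;
    }

    printf("count = %d\n", count);   /* always N */
    return 0;
}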
Yet another matrix-matrix multiply example that uses some of these:
matmult_omp_explicit3.tgz Also available in /home/faculty/terescoj/shared/cs338/lect06.
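Without reproducing the code, here is one plausible shape for such a program (purely illustrative, assuming it combines an explicit parallel region with single and its implied barrier; the tarball may differ):

#include <stdio.h>
#include <omp.h>

#define N 500

static double a[N][N], b[N][N], c[N][N];

int main() {
#pragma omp parallel
    {
        int i, j, k;
        int id = omp_get_thread_num();
        int nthreads = omp_get_num_threads();

        /* one thread initializes the input matrices... */
#pragma omp single
        for (i = 0; i < N; i++)
            for (j = 0; j < N; j++) {
                a[i][j] = i + j;
                b[i][j] = i - j;
            }
        /* ...and the implied barrier at the end of single ensures
           no thread computes with uninitialized data */

        for (i = id; i < N; i += nthreads)   /* interleaved rows */
            for (j = 0; j < N; j++) {
                c[i][j] = 0.0;
                for (k = 0; k < N; k++)
                    c[i][j] += a[i][k] * b[k][j];
            }
    }
    printf("c[0][0] = %f\n", c[0][0]);
    return 0;
}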
OpenMP has a lock variable type and associated functions to operate on it that work a lot like pthread mutexes. Often the critical directive is all we need, but locks are more general.
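A minimal lock sketch (the lock calls are the standard OpenMP API; the counter example itself is made up):

#include <stdio.h>
#include <omp.h>

int main() {
    omp_lock_t lock;
    int count = 0;

    omp_init_lock(&lock);

#pragma omp parallel
    {
        /* same effect as a critical section here, but the lock is a
           first-class object: we could have many of them, e.g. one
           per bucket of a shared data structure */
        omp_set_lock(&lock);
        count++;
        omp_unset_lock(&lock);
    }

    omp_destroy_lock(&lock);
    printf("count = %d\n", count);
    return 0;
}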