This exercise analyzes a program that processes a table (an array) concurrently. The work is spread across several threads,
and each thread adds its result to a shared variable that is protected by a mutex.
In the example there are 4 threads. Each thread processes one quarter of the table, accumulates a local partial sum,
and then adds it to the shared variable called sum.
When all threads have left the do_work function and have been joined, the main function prints the total and ends.
Because sum is read and written by every thread, each update is made while holding the mutex.
Without this synchronization, two threads could read the same old value of sum and one of the updates would be lost.
/******************************************************************************
* compile with gcc -pthread *.c -o loops
* test with valgrind --tool=helgrind ./loops
*
******************************************************************************/
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#define NTHREADS 4
#define ARRAYSIZE 100000000
#define ITERATIONS (ARRAYSIZE / NTHREADS)
double sum=0.0;
double a[ARRAYSIZE];
pthread_mutex_t sum_mutex;
void *do_work(void *tid)
{
    int i, start, *mytid, end;
    double mysum=0.0;
    /* Initialize my part of the global array and keep local sum */
    mytid = (int *) tid;
    start = (*mytid * ITERATIONS);
    end = start + ITERATIONS;
    printf ("\n[Thread %5d] Doing iterations \t%10d to \t %10d",*mytid,start,end-1);
    for (i=start; i < end ; i++) {
        a[i] = i * 1.0;
        mysum = mysum + a[i];
    }
    /* Lock the mutex and update the global sum, then exit */
    pthread_mutex_lock (&sum_mutex);
    sum = sum + mysum;
    pthread_mutex_unlock (&sum_mutex);
    pthread_exit(NULL);
}
int main(int argc, char *argv[])
{
    int i, tids[NTHREADS];
    pthread_t threads[NTHREADS];
    pthread_attr_t attr;
    /* Pthreads setup: initialize mutex and explicitly create threads in a
       joinable state (for portability). Pass each thread its loop offset */
    pthread_mutex_init(&sum_mutex, NULL);
    pthread_attr_init(&attr);
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
    for (i=0; i<NTHREADS; i++) {
        tids[i] = i;
        pthread_create(&threads[i], &attr, do_work, (void *) &tids[i]);
    }
    /* Wait for all threads to complete then print global sum */
    for (i=0; i<NTHREADS; i++) {
        pthread_join(threads[i], NULL);
    }
    printf ("\n[MAIN] Done. Sum= %e", sum);
    sum=0.0;
    /* Serial check, disabled by default:
    for (i=0;i<ARRAYSIZE;i++){
        a[i] = i*1.0;
        sum = sum + a[i]; }
    printf("\n[MAIN] Check Sum= %e",sum);
    */
    /* Clean up and exit */
    pthread_attr_destroy(&attr);
    pthread_mutex_destroy(&sum_mutex);
    pthread_exit (NULL);
}
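As a sanity check on the printed result, the expected total can be worked out by hand. This assumes the program is built with the ARRAYSIZE shown in the listing (N = 10^8) and that every element is set to a[i] = i:

```latex
\[
\text{sum} \;=\; \sum_{i=0}^{N-1} i \;=\; \frac{N(N-1)}{2}
           \;=\; \frac{10^{8}\,(10^{8}-1)}{2}
           \;=\; 4.99999995 \times 10^{15}
\]
```

This value is below 2^53, so every partial sum is an integer that a double represents exactly, and the order in which the threads add their contributions should not change the result. With the %e format (six digits after the decimal point) the total is displayed rounded, as 5.000000e+15.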
Let's illustrate in a practical way why the mutex is necessary, and then improve the quality of the code by removing the global variables:
Review the code, compile it with gcc and run it. Measure the runtime with 1, 2, 4 and 8 threads. Do you notice any difference between these cases?
Verify that the code has no concurrency anomalies. You can use the following command (substituting the name of your executable): valgrind --tool=helgrind ./multi_thread_loop_mutex
Remove the mutex and check the code again with Valgrind. Helgrind should now report a possible data race (a "race condition") on the shared variable sum.
Modify the example by removing all global variables from the code. This makes the code more portable and reduces its complexity.
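One possible shape for the last task, which also helps with the timing task, is sketched below. It is not the reference solution: the names work_t, parallel_sum and the elapsed parameter are hypothetical and introduced only for illustration. The array, the accumulator and the mutex become locals of parallel_sum and reach each worker through a struct. The thread count is a run-time argument, so 1, 2, 4 and 8 threads can be timed without recompiling. The array is allocated on the heap, and a main function that calls parallel_sum is left out.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical argument block: each worker receives everything it needs
   through this struct, so no global variables are required. */
typedef struct {
    size_t start, end;       /* half-open index range [start, end) */
    double *a;               /* shared array, owned by the caller */
    double *sum;             /* shared accumulator, owned by the caller */
    pthread_mutex_t *lock;   /* protects *sum */
} work_t;

static void *do_work(void *arg)
{
    work_t *w = arg;
    double mysum = 0.0;

    for (size_t i = w->start; i < w->end; i++) {
        w->a[i] = (double) i;
        mysum += w->a[i];
    }
    /* One short critical section per thread, as in the original program. */
    pthread_mutex_lock(w->lock);
    *w->sum += mysum;
    pthread_mutex_unlock(w->lock);
    return NULL;
}

/* Fills a[i] = i for i in [0, n) using nthreads workers and returns the sum.
   Returns -1.0 on failure; a valid sum is never negative. If elapsed is not
   NULL, it receives the wall-clock time of the threaded part in seconds. */
double parallel_sum(size_t n, int nthreads, double *elapsed)
{
    struct timespec t0, t1;
    double sum = 0.0;
    pthread_mutex_t lock;
    int created = 0;

    if (nthreads < 1 || n == 0)
        return -1.0;

    double *a = malloc(n * sizeof *a);
    pthread_t *threads = malloc((size_t) nthreads * sizeof *threads);
    work_t *work = malloc((size_t) nthreads * sizeof *work);
    if (!a || !threads || !work) {
        free(a); free(threads); free(work);
        return -1.0;
    }

    pthread_mutex_init(&lock, NULL);
    clock_gettime(CLOCK_MONOTONIC, &t0);

    size_t chunk = n / (size_t) nthreads;
    for (int t = 0; t < nthreads; t++) {
        work[t].start = (size_t) t * chunk;
        /* The last thread also takes the remainder when n % nthreads != 0. */
        work[t].end = (t == nthreads - 1) ? n : work[t].start + chunk;
        work[t].a = a;
        work[t].sum = &sum;
        work[t].lock = &lock;
        if (pthread_create(&threads[t], NULL, do_work, &work[t]) != 0)
            break;
        created++;
    }
    /* sum lives on this stack frame, so join every worker before returning. */
    for (int t = 0; t < created; t++)
        pthread_join(threads[t], NULL);

    clock_gettime(CLOCK_MONOTONIC, &t1);
    if (elapsed)
        *elapsed = (double) (t1.tv_sec - t0.tv_sec)
                 + (double) (t1.tv_nsec - t0.tv_nsec) / 1e9;

    pthread_mutex_destroy(&lock);
    free(work); free(threads); free(a);
    return created == nthreads ? sum : -1.0;
}
```

A main function could call parallel_sum(100000000, nthreads, &secs) with nthreads taken from argv[1] and print both values. Compile with gcc -pthread as before. Running the result under valgrind --tool=helgrind is still worthwhile, since sum remains shared through a pointer and therefore still needs the mutex.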