Part 1: An Introduction to Multi-Threading Concepts
In this section we discuss the most important issues and concepts in
multi-threaded and parallel programming. The material applies to both
shared-memory and distributed-memory applications, and to mainstream
programming languages such as C++, Java and C#.
Memory Systems
- Shared memory parallel computers (SMPs)
- Shared and cache memory
- Shared memory consistency models
- Distributed memory and shared distributed memory
Threads
- What is a thread?
- Thread attributes
- Thread execution lifecycle
- User threads and kernel threads
Data Access in Threads
- Fork-join (master/slave) model
- Shared and private data
- Thread synchronization
Synchronization in Detail
- Mutual exclusion (mutex) and condition variables
- Critical sections
- Memory synchronization and fences
- Barriers
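To give a flavour of the synchronization topics listed above, here is a minimal sketch of mutual exclusion around a critical section. It assumes standard C++11 threads (std::thread, std::mutex); the function name count_with_mutex is hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: each of `num_threads` threads increments a shared
// counter `increments` times, holding a mutex during every update.
long count_with_mutex(int num_threads, int increments) {
    assert(num_threads > 0 && increments >= 0);
    long counter = 0;          // shared data, visible to all threads
    std::mutex counter_mutex;  // guards `counter`

    std::vector<std::thread> threads;
    for (int t = 0; t < num_threads; ++t) {
        // Fork: start one worker thread per iteration.
        threads.emplace_back([&counter, &counter_mutex, increments] {
            for (int i = 0; i < increments; ++i) {
                // Critical section: only one thread at a time gets past here.
                std::lock_guard<std::mutex> lock(counter_mutex);
                ++counter;
            }
        });
    }
    // Join: wait until every worker has finished.
    for (auto& th : threads) {
        th.join();
    }
    return counter;
}
```

Without the lock the increments of different threads could interleave and some updates would be lost, which is the race condition discussed under Troubleshooting.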
Troubleshooting
- Sequential consistency
- Removing data dependencies
- Race conditions
- Deadlock and livelock
Part 2: Parallel Design Techniques
This section discusses how to design software systems that will run in a
multi-processor environment. In this case the traditional system development
methods (for example, object-oriented design) must give way to design methods
that support the parallel nature of the problem. To this end, we introduce
data and task decomposition techniques that help the designer partition the
problem into independent subsystems and assign them to appropriate processors.
Introduction
- Parallel architecture types
- Flynn's taxonomy
- SIMD and MIMD architectures; the SPMD model
- Amdahl's and Gustafson's laws
- Speedup
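The two speedup laws named above can be sketched as two small functions. This is an illustrative sketch only: the function names are hypothetical, p denotes the parallel fraction of the work and n the number of processors. In Gustafson's law p is measured on the parallel run, where the problem size grows with n.

```cpp
#include <cassert>

// Amdahl's law: speedup for a fixed problem size.
// p: fraction of the work that can run in parallel (0 <= p <= 1)
// n: number of processors (n >= 1)
double amdahl_speedup(double p, int n) {
    assert(p >= 0.0 && p <= 1.0 && n >= 1);
    return 1.0 / ((1.0 - p) + p / n);
}

// Gustafson's law: scaled speedup when the problem grows with n.
// p: parallel fraction of the run time on the parallel machine
double gustafson_speedup(double p, int n) {
    assert(p >= 0.0 && p <= 1.0 && n >= 1);
    return (1.0 - p) + p * n;
}
```

For example, with p = 0.9 and n = 10, Amdahl's law gives a speedup of about 5.26, while Gustafson's law gives 9.1.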
Decomposition Techniques
- Task and data decomposition
- Grouping and ordering tasks
- Data sharing among tasks
- Evaluation
Algorithm Structure
- Task and data parallelism
- Divide and conquer
- Geometric decomposition
- Other decomposition techniques
Parallel Design Patterns
- SPMD pattern
- Master/Worker pattern
- Loop parallelism pattern
- Shared data and shared queues patterns
Part 3: OpenMP Core Techniques
This section discusses the OpenMP API and what it has to offer when we
wish to implement parallel software systems. We introduce the most important
directives (pragmas), library functions and environment variables that are the
building blocks for multi-threaded and parallel applications.
Overview
- Compiler directives
- Library routines
- Environment variables
My First OpenMP Program
- Writing the serial program
- Determining parallel code
- Adding OpenMP directives
- Debugging and performance measurement
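The steps listed above can be illustrated with a small sketch: start from a correct serial loop, check that its iterations are independent, and then add a single directive. The routine name axpy is hypothetical. When the code is compiled without OpenMP support the pragma is ignored and the loop simply runs serially.

```cpp
#include <cassert>
#include <vector>

// Hypothetical example routine: y[i] = a * x[i] + y[i] for every i.
void axpy(double a, const std::vector<double>& x, std::vector<double>& y) {
    assert(x.size() == y.size());
    const long n = static_cast<long>(x.size());

    // Step 1 (serial program): the loop below is correct on its own.
    // Step 2 (determine parallel code): no iteration reads or writes an
    //         element that belongs to another iteration.
    // Step 3 (add directive): share the iterations among the threads.
    #pragma omp parallel for
    for (long i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

With GCC or Clang, OpenMP is typically enabled with the -fopenmp flag; comparing the results with and without the flag is a simple first check of the parallel version.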
Data Clauses in OpenMP
- Shared and private
- Lastprivate, firstprivate
- Default and nowait clauses
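As an illustration of the data clauses listed above, the following sketch uses firstprivate and lastprivate on a loop. The function name last_value is hypothetical. firstprivate gives every thread its own copy of offset, initialised from the original; lastprivate copies the value from the sequentially last iteration back to the original variable.

```cpp
#include <cassert>

// Hypothetical example: returns (n - 1) * (n - 1) + offset, that is, the
// value assigned in the sequentially last loop iteration.
int last_value(int n, int offset) {
    assert(n > 0);
    int last = 0;
    #pragma omp parallel for firstprivate(offset) lastprivate(last)
    for (int i = 0; i < n; ++i) {
        // `last` is private inside the loop; after the loop the original
        // variable holds the value written by iteration n - 1.
        last = i * i + offset;
    }
    return last;
}
```

The result is the same whether or not the code is compiled with OpenMP enabled, which makes small functions like this convenient for checking one's understanding of a clause.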
OpenMP Synchronization Constructs
- Barrier
- Ordered
- Critical and Atomic
- Locks, Master construct
Work Sharing in OpenMP
- Loop construct
- Sections and section
- Single construct
- Combined parallel work-sharing constructs
Other Clauses
- Reduction clause
- Copyin clause
- Copyprivate clause
- Ordered clause
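Of the clauses listed above, reduction is probably the one met first in practice. The sketch below uses a hypothetical dot function and the + operator: every thread accumulates into a private copy of sum, and OpenMP combines the copies when the loop ends.

```cpp
#include <cassert>
#include <vector>

// Hypothetical example: dot product of two equally sized vectors.
double dot(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    const long n = static_cast<long>(a.size());
    double sum = 0.0;
    // Each thread sums into its own private `sum`; the partial sums are
    // added together at the end of the loop.
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

Without the reduction clause all threads would update the shared variable sum at the same time, which is a race condition. Because the order of the floating-point additions can differ between runs, the last digits of a parallel result may differ slightly from the serial one.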
Configuration and Run-Time Information
- Setting environment variables
- Library functions for thread information
- Scheduling functions
- Lock functions
- Timing functions
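A minimal sketch of querying run-time information follows. It uses the standard OpenMP routine omp_get_num_threads; the wrapper name threads_in_parallel_region is hypothetical. The _OPENMP guard lets the same source build without OpenMP, in which case it reports one thread.

```cpp
#include <cassert>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper: returns the number of threads a parallel region
// actually used (1 when the code is compiled without OpenMP).
int threads_in_parallel_region() {
    int count = 1;  // shared: declared outside the parallel region
    #pragma omp parallel
    {
        // Only one thread of the team executes the block below.
        #pragma omp single
        {
#ifdef _OPENMP
            count = omp_get_num_threads();
#endif
        }
    }
    assert(count >= 1);
    return count;
}
```

The team size can usually be controlled from outside the program through the OMP_NUM_THREADS environment variable, for example by setting it to 4 before starting the executable.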
Part 4: Applications and Performance Measuring
In this section we show how to integrate the techniques from the first three
parts of the course in order to implement robust, correct and efficient
parallel applications. We discuss loop optimization, troubleshooting OpenMP
programs and application development.
Troubleshooting in OpenMP
- Common problems
- Race conditions; shared and private variables
- Work scheduling assumptions
- Side-effects; the need for thread safety
Advanced Problems
- Memory consistency problems
- Using flush
- Deadlock and livelock situations
Debugging
- Verification of the serial program version
- Verification of the parallel program version
- Using tools
Applications, Demos and Discussion
- Monte Carlo simulation
- Matrix algebra and solving linear systems
- Sorting
- Finite Difference method