Lecture 1 - Introduction and Overview
Welcome to CS 338, Parallel Processing.
- Announcements
- Syllabus and Course Ground Rules
- Why Parallel Computing?
- Some Basics
- Computer Science Colloquium resumes this week with two senior
honors thesis progress reports: Jeremy Redburn and Shimon Rura.
Senior majors are required to attend; the rest of you are encouraged
to come along.
Please read the syllabus carefully. If there are questions, ask!
Let's consider a few "real world" examples:
In each of these cases, we take what we might normally think of as a
sequential process and take advantage of concurrent processing to put
multiple workers (processing units) to use.
Parallelism adds complexity (as we will see in great detail), so why
bother?
- we want to solve the same problem, but in a shorter time than is
  possible on one processor
- we want to solve larger problems than can currently be solved at all
  on a single processor
- some algorithms are more naturally expressed or organized concurrently
Sequential Program:
sequence of actions that produce a result (statements + variables),
called a process, task, or thread (of control). The state of the program
is determined by the code, the data, and a single program counter.
Concurrent Program:
two or more processes that work together. Big difference: multiple program counters.
To cooperate, the processes need communication and synchronization, which
can be achieved through shared variables or through message passing.
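To make the shared-variable case concrete, here is a small sketch using
POSIX threads (pthreads is just one possible mechanism, chosen here only
for illustration): two threads cooperate by updating a shared counter,
and a mutex provides the synchronization.

    #include <stdio.h>
    #include <pthread.h>

    /* shared variable: both threads read and update it */
    int counter = 0;
    pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    void *work(void *arg) {
        for (int i = 0; i < 100000; i++) {
            pthread_mutex_lock(&lock);    /* synchronization */
            counter++;                    /* communication via shared memory */
            pthread_mutex_unlock(&lock);
        }
        return NULL;
    }

    int main(void) {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, work, NULL);
        pthread_create(&t2, NULL, work, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        printf("counter = %d\n", counter);  /* 200000 with the mutex in place */
        return 0;
    }

Without the mutex, the two threads' updates could interleave and the
final value would be unpredictable.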
Hardware to run concurrent processes:
single processor - logical concurrency (see CS 432)
multiprocessor - shared memory
multicomputer - separate memories
network - slower communication
Computers may be classified as:
SISD: single instruction, single data - one processor doing
one thing at a time to one piece of data at a time
SIMD: single instruction, multiple data - multiple processors
all doing the same thing at the same time, but operating on
different data; a.k.a. vector computers. The program operates in
"lock step" on each processor.
MIMD: multiple instruction, multiple data - multiple
processors each doing their own thing.
SPMD: single program, multiple data - not really a
classification of the computer, but of a model used to program a
MIMD computer. Multiple processors run the same program, but do not
operate in lock step; a.k.a. the "interacting peers" model. This
is the model we will use most in this class.
Some examples:
SISD: Squall, Williams College: Macintosh PowerBook, 800 MHz PowerPC G4
MIMD: Bullpen Cluster, Williams College: 8-node Sun cluster, total
of 20 UltraSPARC II processors at 450 MHz (plus some older stuff)
MIMD: ASCI Red, Sandia National Labs: 4600+ nodes, each with 2 Intel
Pentium II Xeon processors, first TeraOp machine in 1997
MIMD: ASCI White, LLNL: 512 nodes, each with 16 Power3 Nighthawk-2
processors, 12 TeraOps total, was number 1 until 2002
Hybrid: Earth Simulator, Yokohama Institute for Earth Sciences, Japan:
640-node NEC system, each node with 8 vector processors, total of
5,120 CPUs, peak performance of 40 TeraOps
See http://www.top500.org/.
How to Achieve Parallelism
Need to determine where concurrency is possible, then break
up the work accordingly
Easiest if a compiler can do this for you - take your
sequential program and extract the concurrency automatically. This is
sometimes possible, especially with fixed-size array computations.
If the compiler can't do it, it is possible to give "hints" to
the compiler to tell it what is safe to parallelize.
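For example, a directive-based system such as OpenMP (mentioned here only
to illustrate the idea of a hint; it is an assumption, not a tool we have
introduced) lets the programmer mark a loop as safe to parallelize:

    #include <stdio.h>

    #define N 1000

    int main(void) {
        double a[N], b[N], c[N];

        for (int i = 0; i < N; i++) {
            a[i] = i;
            b[i] = 2.0 * i;
        }

        /* the "hint": this loop has no dependences between iterations,
           so the compiler may divide its iterations among threads */
        #pragma omp parallel for
        for (int i = 0; i < N; i++)
            c[i] = a[i] + b[i];

        printf("c[%d] = %f\n", N - 1, c[N - 1]);
        return 0;
    }

A compiler without OpenMP support simply ignores the pragma, and the
loop runs sequentially.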
But often, the parallelization must be done explicitly: the
programmer has to create the threads or processes, assign work to
them, and manage necessary communication.
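As a rough sketch of what that explicit work can look like (the details
here are purely illustrative, not a required pattern), each MPI process
below is assigned a share of a summation based on its rank, and the
partial results are then combined by explicit communication:

    #include <stdio.h>
    #include <mpi.h>

    #define N 10000

    int main(int argc, char *argv[]) {
        int rank, size;
        long i, local = 0, total = 0;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* which process am I? */
        MPI_Comm_size(MPI_COMM_WORLD, &size);   /* how many processes? */

        /* work assignment: each process takes every size-th term */
        for (i = rank; i < N; i += size)
            local += i;

        /* communication: combine the partial sums onto process 0 */
        MPI_Reduce(&local, &total, 1, MPI_LONG, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("sum of 0..%d = %ld\n", N - 1, total);

        MPI_Finalize();
        return 0;
    }

Notice that the programmer, not the compiler, decided how to split the
loop and when to communicate.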
For part of Homework 1, you will need to use the Bullpen Cluster,
located here in the department. You will need to compile and run a
parallel version of a "hello, world" program that uses the Message
Passing Interface (MPI) to start the processes and for interprocess
communication. We will look at MPI in great detail throughout the
semester.
MPI program demo.
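A minimal version of such a program might look like the following (the
actual code handed out for Homework 1 may differ in its details). It is
also a first look at the SPMD model described above: every process runs
the same program, and each discovers its own rank at run time.

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[]) {
        int rank, size;

        MPI_Init(&argc, &argv);                 /* start up MPI */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's id */
        MPI_Comm_size(MPI_COMM_WORLD, &size);   /* number of processes */

        printf("Hello, world, from process %d of %d\n", rank, size);

        MPI_Finalize();                         /* shut down MPI */
        return 0;
    }

With most MPI installations, such a program is compiled with mpicc and
started with mpirun or mpiexec, specifying how many processes to launch.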