paper drafts due Friday so I can get them back to you on Monday
presentations on Monday afternoon
final paper, code, demos, etc due next Friday
Distributed Systems (see notes from last time)
Parallel File System: IBM GPFS
Consider a cluster environment - lots of nodes all
trying to read files from or write files to the same
partition, concurrently
Even with a high-performance file system and a fast
network, file I/O can be a significant bottleneck
Idea: parallel file system - take disks attached to
many or all cluster nodes and manage them as one large
partition
Many issues to consider:
virtual file system layer will allow interoperation with
standard file system
files are stored on individual disks - store entire
files on one node, or striped across multiple disks as in a
RAID?
what about partition organization information
(superblocks, root directory structures, inode lists, free
block lists)? strictly distributed? replicated?
replicate files? cache locally?
if a file is created on one node, is it created on the
disk local to that node?
if a file is repeatedly accessed on another node, should
it migrate?
actual implementation stores strips of files across
disks, tries to keep usage balanced, allows for a level of
replication
administrator can adjust layout of files among
disks/nodes
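The striping idea above can be sketched in a few lines. This is an illustrative round-robin placement, not GPFS's actual algorithm; the function name and parameters are made up for the example.

```python
# Sketch of round-robin block striping with replication across cluster disks.
# Names and parameters are illustrative, not GPFS's real layout policy.

def place_blocks(num_blocks, num_disks, replicas=2):
    """Map each file block to `replicas` distinct disks, round-robin."""
    layout = []
    for b in range(num_blocks):
        primary = b % num_disks
        # copies go on consecutive disks, wrapping, never the same disk twice
        copies = [(primary + i) % num_disks for i in range(replicas)]
        layout.append(copies)
    return layout

layout = place_blocks(num_blocks=6, num_disks=4, replicas=2)
# block 0 -> disks [0, 1], block 1 -> disks [1, 2], block 3 -> disks [3, 0]
```

Round-robin keeps usage roughly balanced and lets reads of a large file proceed from many disks in parallel, which is the point of striping in the first place.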
Virtualization
Examples: Java VM, Emulators
Hardware is now fast enough to host several operating
systems simultaneously
Several projects aim to allow this: Xen, VMWare, Parallels
Motivations
a single user would like to run multiple OSs
simultaneously on an individual system without needing to reboot
a single user would like to experiment with an OS -
boot it into a virtual machine instead of on the raw hardware
want several server systems, each with its own OS, where
the "owner" of each server has administrative access, but no
single server is demanding enough to make full use
of available resources - combine them into a smaller number of
virtual servers that are isolated from each other (each
running its own copy of the OS) but share physical hardware
multi-core systems only make this more desirable
One approach: have one "master" operating system and other
("guest") systems boot within a process in the host system
each guest OS is presented a virtual hardware interface
by the master
guests share physical resources like regular processes
share the resources when running on the master
Currently popular approach: "master" system is very small,
only provides virtual hardware interfaces to guest systems and
manages resources (CPU, memory, I/O) among guests
Xen: an open source virtualization project
Xen calls each running OS instance a "domain"
the privileged master domain is "domain 0"; the thin
layer beneath all domains is the hypervisor
hypervisor and domain 0 start first, guest domains
are then added
uses paravirtualization - not a strict virtual
x86 presented to guests, but something more convenient
paravirtualization requires some changes to guest OS
full virtualization (as in VMware) allows unmodified
guest OS to run
however, with paravirtualization, user processes still
run unmodified
Important issues to consider:
manage sharing of resources among guests
make scheduling decisions as quickly as possible -
useful work can only be done if one of the guests is executing
on the CPU
manage I/O and interrupts: make sure the hypervisor can
preempt guests, make sure the guests can still use interrupts
for their own resource scheduling
timers: when a guest OS needs a timer (such as for a
scheduling quantum) does it want wall-clock time or
virtualized time (is time suspended for the OS when some other
OS is on the CPU?) - both make sense in some contexts
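The wall-clock vs. virtualized time distinction can be made concrete with a small sketch. The class and field names here are invented for illustration; real hypervisors track this inside the scheduler.

```python
# Sketch contrasting wall-clock vs virtualized time for a guest OS.
# Virtual time advances only while the guest is actually on the CPU.

class GuestClock:
    def __init__(self):
        self.virtual = 0   # advances only while this guest is scheduled
        self.wall = 0      # advances every tick regardless

    def tick(self, on_cpu):
        self.wall += 1
        if on_cpu:
            self.virtual += 1

# two guests time-sliced on one CPU: A runs on even ticks, B on odd ticks
a, b = GuestClock(), GuestClock()
for t in range(10):
    a.tick(on_cpu=(t % 2 == 0))
    b.tick(on_cpu=(t % 2 == 1))

# each guest sees half the wall-clock time as virtual time
```

A scheduling quantum naturally wants virtual time (the guest shouldn't be charged for time it never ran), while an alarm set for "9:00 AM" wants wall-clock time, which is why both are useful.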
provide performance as close as possible to OS directly
on the hardware - minimize cost of the extra layer of
abstraction - but maintain safe separation of guest OSs
most OSs assume they have highest privilege, but Xen on
x86: hypervisor uses protection ring 0, guest OS kernels use
rings 1 and 2 (they are accustomed to running in ring 0), guest
OS user-level processes can still use ring 3
privileged instructions are replaced by paravirtualized
"hypercalls"
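The hypercall mechanism can be sketched as a dispatch table: the guest kernel, which can no longer execute privileged instructions itself, asks the hypervisor to do the work. Everything below (numbers, function names, the validation comment) is made up for illustration and is not Xen's actual hypercall ABI.

```python
# Sketch of a hypercall dispatch table. Privileged operations in a guest
# kernel are replaced by calls into the hypervisor, which validates and
# performs them on the guest's behalf.

HYPERCALLS = {}

def hypercall(number):
    """Register a handler under a hypothetical hypercall number."""
    def register(fn):
        HYPERCALLS[number] = fn
        return fn
    return register

@hypercall(1)
def update_page_table(guest_id, vaddr, frame):
    # a real hypervisor would check that `frame` belongs to this guest
    return ("mapped", guest_id, vaddr, frame)

def do_hypercall(number, *args):
    # analogous to the guest trapping into ring 0 on real hardware
    return HYPERCALLS[number](*args)

result = do_hypercall(1, 7, 0x1000, 42)
```

The indirection is exactly why page-table updates are a dominant cost: each one that would have been a single privileged instruction now crosses the guest/hypervisor boundary.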
memory management is messy: need to manage page tables,
TLBs, etc, and need privileged instructions to modify page
table hardware, etc - need to trap to hypervisor for much of
this
memory allocation among guests - initial
"reservation" for each, but not contiguous
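A non-contiguous reservation can be sketched as each guest drawing frames from one shared physical pool up to its quota. The pool size, guest names, and helper below are illustrative assumptions, not Xen's allocator.

```python
# Sketch of per-guest memory reservations drawn from one physical frame
# pool. Frames handed to a guest need not be contiguous.

free_frames = list(range(16))               # 16 physical frames, illustrative
reservations = {"guestA": 6, "guestB": 6}   # hypothetical quotas
owned = {g: [] for g in reservations}

def grant_frame(guest):
    """Give `guest` one more frame if it is under its reservation."""
    if len(owned[guest]) >= reservations[guest]:
        raise MemoryError("reservation exhausted")
    owned[guest].append(free_frames.pop(0))

# interleaved allocation: each guest ends up with scattered frames
for _ in range(3):
    grant_frame("guestA")
    grant_frame("guestB")

# guestA holds frames [0, 2, 4]; guestB holds [1, 3, 5]
```

Since the guest's page tables already translate addresses, physical contiguity buys little, and scattering frames makes it easy to grow or shrink a guest's allocation later.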
network: each guest generates a "fake" MAC address,
mapped to an IP address
exceptions, faults, traps are straightforward
I/O is done through simplified device abstractions and
details are handled at the hypervisor level
porting Linux to the paravirtualized machine involved
2995 lines of code change, or 1.36% of the kernel source code
Many more details in Xen and the Art of Virtualization
linked from Xen page below
Xen feature: can do "live migration" of running operating
systems among hosts