paper drafts due Friday so I can get them back to you on Monday
presentations on Monday afternoon
final paper, code, demos, etc due next Friday
Distributed Systems (see notes from last time)
Parallel File System: IBM GPFS
Consider a cluster environment - lots of nodes all
trying to read files from or write files to the same
partition, concurrently
Even with a high-performance file system and a fast
network, file I/O can be a significant bottleneck
Idea: parallel file system - take disks attached to
many or all cluster nodes and manage them as one large
partition
Many issues to consider:
virtual file system layer will allow interoperation with
standard file system
files are stored on individual disks - store entire
files on one node, or striped across multiple disks as in a
RAID?
what about partition organization information
(superblocks, root directory structures, inode lists, free
block lists)? strictly distributed? replicated?
replicate files? cache locally?
if a file is created on one node, is it created on the
disk local to that node?
if a file is repeatedly accessed on another node, should
it migrate?
actual implementation stores strips of files across
disks, tries to keep usage balanced, allows for a level of
replication
administrator can adjust layout of files among
disks/nodes
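The striping idea above can be sketched in a few lines. This is an illustrative round-robin placement, not GPFS's actual algorithm; the function name and parameters are made up for the example.

```python
# Sketch of round-robin block striping with replication across cluster disks.
# Names and parameters are illustrative, not GPFS's real layout policy.

def place_blocks(num_blocks, num_disks, replicas=2):
    """Map each file block to `replicas` distinct disks, round-robin."""
    layout = []
    for b in range(num_blocks):
        primary = b % num_disks
        # copies go on consecutive disks, wrapping, never the same disk twice
        copies = [(primary + i) % num_disks for i in range(replicas)]
        layout.append(copies)
    return layout

layout = place_blocks(num_blocks=6, num_disks=4, replicas=2)
# block 0 -> disks [0, 1], block 1 -> disks [1, 2], block 3 -> disks [3, 0]
```

Round-robin keeps usage roughly balanced and lets reads of a large file proceed from many disks in parallel, which is the point of striping in the first place.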
Virtualization
Examples: Java VM, Emulators
Hardware is now fast enough to host several operating
systems simultaneously
Several projects aim to allow this: Xen, VMWare, Parallels
Motivations
a single user would like to run multiple OSs
simultaneously on an individual system without needing to reboot
a single user would like to experiment with an OS -
boot it into a virtual machine instead of on the raw hardware
want several server systems, each with its own OS, where
the "owner" of each server has administrative access, but no
single server is demanding enough to make full use
of available resources - combine them into a smaller number of
virtual servers that are isolated from each other (each
running its own copy of the OS) but share physical hardware
multi-core systems only make this more desirable
One approach: have one "master" operating system and other
("guest") systems boot within a process in the host system
each guest OS is presented a virtual hardware interface
by the master
guests share physical resources like regular processes
share the resources when running on the master
Currently popular approach: "master" system is very small,
only provides virtual hardware interfaces to guest systems and
manages resources (CPU, memory, I/O) among guests
Xen: an open source virtualization project
Xen calls each running OS instance a "domain"
the privileged master domain is "domain 0"; the thin
layer beneath all domains is the hypervisor
hypervisor and domain 0 start first, guest domains
are then added
uses paravirtualization - not a strict virtual
x86 presented to guests, but something more convenient
paravirtualization requires some changes to guest OS
full virtualization (as in VMware) allows unmodified
guest OS to run
however, with paravirtualization, user processes still
run unmodified
Important issues to consider:
manage sharing of resources among guests
make scheduling decisions as quickly as possible -
useful work can only be done if one of the guests is executing
on the CPU
manage I/O and interrupts: make sure the hypervisor can
preempt guests, make sure the guests can still use interrupts
for their own resource scheduling
timers: when a guest OS needs a timer (such as for a
scheduling quantum) does it want wall-clock time or
virtualized time (is time suspended for the OS when some other
OS is on the CPU?) - both make sense in some contexts
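The wall-clock vs. virtualized time distinction can be made concrete with a small sketch. The class and field names here are invented for illustration; real hypervisors track this inside the scheduler.

```python
# Sketch contrasting wall-clock vs virtualized time for a guest OS.
# Virtual time advances only while the guest is actually on the CPU.

class GuestClock:
    def __init__(self):
        self.virtual = 0   # advances only while this guest is scheduled
        self.wall = 0      # advances every tick regardless

    def tick(self, on_cpu):
        self.wall += 1
        if on_cpu:
            self.virtual += 1

# two guests time-sliced on one CPU: A runs on even ticks, B on odd ticks
a, b = GuestClock(), GuestClock()
for t in range(10):
    a.tick(on_cpu=(t % 2 == 0))
    b.tick(on_cpu=(t % 2 == 1))

# each guest sees half the wall-clock time as virtual time
```

A scheduling quantum naturally wants virtual time (the guest shouldn't be charged for time it never ran), while an alarm set for "9:00 AM" wants wall-clock time, which is why both are useful.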
provide performance as close as possible to OS directly
on the hardware - minimize cost of the extra layer of
abstraction - but maintain safe separation of guest OSs
most OSs assume they have highest privilege, but Xen on
x86: hypervisor uses protection ring 0, guest OS kernels use
rings 1 and 2 (they are accustomed to running in ring 0), guest
OS user-level processes can still use ring 3
privileged instructions are replaced by paravirtualized
"hypercalls"
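The hypercall mechanism can be sketched as a dispatch table: the guest kernel, which can no longer execute privileged instructions itself, asks the hypervisor to do the work. Everything below (numbers, function names, the validation comment) is made up for illustration and is not Xen's actual hypercall ABI.

```python
# Sketch of a hypercall dispatch table. Privileged operations in a guest
# kernel are replaced by calls into the hypervisor, which validates and
# performs them on the guest's behalf.

HYPERCALLS = {}

def hypercall(number):
    """Register a handler under a hypothetical hypercall number."""
    def register(fn):
        HYPERCALLS[number] = fn
        return fn
    return register

@hypercall(1)
def update_page_table(guest_id, vaddr, frame):
    # a real hypervisor would check that `frame` belongs to this guest
    return ("mapped", guest_id, vaddr, frame)

def do_hypercall(number, *args):
    # analogous to the guest trapping into ring 0 on real hardware
    return HYPERCALLS[number](*args)

result = do_hypercall(1, 7, 0x1000, 42)
```

The indirection is exactly why page-table updates are a dominant cost: each one that would have been a single privileged instruction now crosses the guest/hypervisor boundary.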
memory management is messy: need to manage page tables,
TLBs, etc, and need privileged instructions to modify page
table hardware, etc - need to trap to hypervisor for much of
this
memory allocation among guests - initial
"reservation" for each, but not contiguous
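A non-contiguous reservation can be sketched as each guest drawing frames from one shared physical pool up to its quota. The pool size, guest names, and helper below are illustrative assumptions, not Xen's allocator.

```python
# Sketch of per-guest memory reservations drawn from one physical frame
# pool. Frames handed to a guest need not be contiguous.

free_frames = list(range(16))               # 16 physical frames, illustrative
reservations = {"guestA": 6, "guestB": 6}   # hypothetical quotas
owned = {g: [] for g in reservations}

def grant_frame(guest):
    """Give `guest` one more frame if it is under its reservation."""
    if len(owned[guest]) >= reservations[guest]:
        raise MemoryError("reservation exhausted")
    owned[guest].append(free_frames.pop(0))

# interleaved allocation: each guest ends up with scattered frames
for _ in range(3):
    grant_frame("guestA")
    grant_frame("guestB")

# guestA holds frames [0, 2, 4]; guestB holds [1, 3, 5]
```

Since the guest's page tables already translate addresses, physical contiguity buys little, and scattering frames makes it easy to grow or shrink a guest's allocation later.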
network: each guest generates a "fake" MAC address,
mapped to an IP address
exceptions, faults, traps are straightforward
I/O is done through simplified device abstractions and
details are handled at the hypervisor level
porting Linux to the paravirtualized machine involved
2995 lines of code change, or 1.36% of the kernel source code
Many more details in Xen and the Art of Virtualization
linked from Xen page below
Xen feature: can do "live migration" of running operating
systems among hosts