situation: a collection of computers connected by a network,
which allows for:
resource sharing - share files, use remote hardware devices
(tape drives, printers)
computational speedup
reliability - if a node fails, others are available while
repairs are made
The OS for a distributed system could be:
Network Operating System - users are aware of multiple
nodes and access resources explicitly:
remote login to the appropriate machine (telnet, rlogin, ssh)
transferring files explicitly (ftp, scp)
just use a standard OS with network support
simpler, but more burden on the user
Distributed Operating System - users need not know
where things are on the network
system handles the transparency
data migration - automated file transfer (scp), file sharing
(NFS, AFS, Samba)
computation migration - transfer computation across the
system
remote procedure call (RPC) - a process on one node makes a
function call (essentially) that runs on a different node (see
the sketch after this list)
process migration - execute an entire process, or parts of
it, on different nodes
more complex, but more automated
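A minimal sketch of the RPC idea, using Python's standard-library
xmlrpc modules; the add() function, host name, and port are made-up
examples, not part of any particular distributed OS:

    # --- server side (runs on one node) ---
    from xmlrpc.server import SimpleXMLRPCServer

    def add(a, b):
        return a + b                      # this body executes on the server node

    server = SimpleXMLRPCServer(("0.0.0.0", 8000))
    server.register_function(add, "add")
    server.serve_forever()

    # --- client side (runs on another node) ---
    # import xmlrpc.client
    # proxy = xmlrpc.client.ServerProxy("http://servernode:8000/")
    # print(proxy.add(2, 3))              # looks like a local call, runs remotely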
Design Issues for a Distributed OS: Transparency
access transparency - ability to access local and remote
resources in a uniform manner
location transparency - users have no awareness of object
locations
migration transparency - object can be moved without
changing name or access method
concurrency transparency - share objects without
interference
replication transparency - consistency of multiple
instances (or partitions) of files and data
parallelism transparency - parallelism without need to
have users aware of the details
failure transparency - graceful degradation rather than
system disruption, minimize damage to users - fault tolerance
performance transparency - consistent and predictable
performance when the system structure or load changes
size transparency - system can grow without users'
knowledge - scalability - difficult issue, as bottlenecks may
arise in large systems
Distributed File Systems
share disk files over a network
Terminology:
Service - software running on one or more
machines providing some function to other machines
Server - service software running on a single machine
Client - process that can invoke a service
Client interface - set of client operations
provided by a service
For a DFS, the client interface most likely includes a set
of file operations, much like those available to access local
disks (create, delete, read, write)
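As a concrete (hypothetical) sketch of such a client interface,
in Python - the operation names mirror the list above; any real DFS
defines its own exact set:

    # Hypothetical sketch of a DFS client interface: the same operations a
    # local file system offers, independent of where the file is stored.
    from abc import ABC, abstractmethod

    class FileService(ABC):
        @abstractmethod
        def create(self, path: str) -> None: ...

        @abstractmethod
        def delete(self, path: str) -> None: ...

        @abstractmethod
        def read(self, path: str, offset: int, count: int) -> bytes: ...

        @abstractmethod
        def write(self, path: str, offset: int, data: bytes) -> None: ...

A local implementation and a remote implementation can then be used
interchangeably by client programs.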
DFS Issues:
remote and local files should be accessible through the
same client interface
performance is important (access time, service time,
latency)
naming - file names need to have enough information to
find the location
replication - are there replicas of some or all files?
How are they kept synchronized?
caching - use local disks to cache remote files, use
memory cache on client or server
Naming Schemes
location transparency - file name does not reveal
the file's physical storage location
location independence/migration transparency - file name
remains the same when the physical location changes
Approach 1: file names include a location and a path
bull:/home/faculty
\\ntserver\path\to\file
no location transparency or independence here
Approach 2: remote directories are attached to local
directories, in much the same way local filesystems are included
Sun Network File System (NFS) (more soon)
Windows "attach network drive"
Mac "connect to server"
File names do not include the server name, but the server name
is needed to make the initial connection or "mount"
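For example, on a typical Linux client (hostnames and paths here are
made up):

    mount -t nfs fileserver:/export/home /mnt/home

The server name appears only in this one-time mount command;
afterwards, files are named /mnt/home/... with no mention of
fileserver.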
Approach 3: total integration of file systems - a single
global name structure spans all files
NFS (Sun's Network File System):
a means to share filesystems (or parts of filesystems) among
clients transparently
a remote directory is mounted over a local file system
directory, just as we mount local disk partitions into the directory
hierarchy
mounting requires knowledge of physical location of files -
both hostname and local path on the server
usage, however, is transparent - works just like any filesystem
mounted from a local disk
mount mechanism is separate from file service mechanism; each
uses RPC (remote procedure call)
interoperable - can work with different machine architectures,
OS, networks
mount mechanism requires that the client is authorized to connect
(see /etc/exports in most Unix variants, /etc/dfs/dfstab
in Solaris) - mountd process services mount requests
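A hypothetical /etc/exports entry (Linux syntax; host and path names
made up):

    /export/home    client1.example.com(rw)  client2.example.com(ro)

mountd consults this list and refuses mount requests from hosts not
named in it.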
when a mount is completed, a file handle (essentially just
the inode of the remote directory) is returned that the client OS can
use to translate local file requests to NFS file service requests -
nfsd process services file access requests
NFS fits in at the virtual file system (VFS) layer in the OS -
instead of translating a path to a particular local partition and file
system type, requests are converted to NFS requests
NFS servers are stateless - each request has to provide all
necessary information - server does not maintain information about
client connections
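A sketch of what statelessness means for the service interface, in
Python (the handle table and function names are made up, and real NFS
file handles encode more than an inode number):

    # Each request carries the file handle and offset - the server keeps no
    # record of clients, open files, or file positions between requests.
    EXPORTS = {42: "/export/home/data.txt"}    # file handle -> local path

    def nfs_read(handle: int, offset: int, count: int) -> bytes:
        with open(EXPORTS[handle], "rb") as f:
            f.seek(offset)
            return f.read(count)

    def nfs_write(handle: int, offset: int, data: bytes) -> int:
        with open(EXPORTS[handle], "r+b") as f:
            f.seek(offset)
            return f.write(data)

If the server crashes and reboots, clients simply retry their
requests - there is no per-client state to rebuild.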
two main ways for clients to know what to request and from
where: entries in the file system table (/etc/fstab or
/etc/vfstab), or automount tables (see /etc/auto_* on bullpen,
for example). fstab entries are mounted when the system comes
up; automount entries are mounted on demand and unmounted when
not active
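A hypothetical /etc/fstab entry for an NFS mount (Linux syntax,
made-up names):

    fileserver:/export/home   /home   nfs   defaults   0 0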
many NFS implementations include an extra client-side process,
nfsiod, that acts as an intermediary, allowing things like
read-ahead and delayed writes. This improves performance, though the
delayed writes add some danger (see below).
Caching Remote File Access
Caching is important - can use the regular buffer cache on
the client, and client disk space as well. The server
automatically uses its buffer cache for NFS requests just as it
does for local requests
Advantages of caching on disk:
reliability (non-volatile)
can remain across reboots
can be larger
Advantages of caching in memory:
can be used for diskless workstations
faster
already have memory cache for local access, and
putting a remote file cache there allows reuse of that
mechanism
Cache Update Policy - mostly the same issues we have seen in
other contexts
Write-through - write data through to disk as soon as
write call is made - reliable, but performance is poor
Delayed-write - write to cache now, to server
"later" - much faster write, but dangerous!
Cache Consistency: need to know if the copy of a disk block in
our cache is consistent with the master copy
client-initiated: client that wants to reuse a block
checks with the server
server-initiated: server notifies clients of any changes
from other processes
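A sketch of the client-initiated approach in Python (server_getattr
and server_read stand in for the actual RPCs; using a modification
time as the version check is a simplification):

    cache = {}    # block -> (data, server mtime when cached)

    def server_getattr(block):        # stub: ask server for last-modified time
        return 0.0

    def server_read(block):           # stub: fetch the block from the server
        return b""

    def read_block(block):
        mtime = server_getattr(block)
        if block in cache and cache[block][1] == mtime:
            return cache[block][0]    # cached copy still consistent - reuse it
        data = server_read(block)     # otherwise refetch from the server
        cache[block] = (data, mtime)
        return data

Checking on every access is expensive; a real client might check only
when a file is opened, or only after a cached entry reaches a certain
age.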
Stateless vs. Stateful File Service
stateless (e.g., NFS) - each request is self-contained; simple
crash recovery, since there is no per-client state to lose or rebuild
stateful - server tracks clients and open files; allows
optimizations like read-ahead, but crash recovery is harder
Replication
reliability/availability: one server goes down, use
another that has a replica
efficiency: use the closest or least-busy server
technique used by web servers - a request for a file at
www.something.com may be silently redirected to one of a number
of servers, like www2.something.com, www28.something.com, etc.
main issue: keeping replicas up to date when one or more
is changing
if there is a "master copy" we can use caching ideas -
just treat replicas like cached copies
if there is no master, any change must be made to all
replicas
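A sketch of the no-master case in Python (the replica list and
send_write are made-up stand-ins):

    replicas = ["fs1.example.com", "fs2.example.com", "fs3.example.com"]

    def send_write(host, path, data):     # stub standing in for the write RPC
        pass

    def replicated_write(path, data):
        for host in replicas:             # the change goes to every replica,
            send_write(host, path, data)  # not just the nearest one

A real system must also handle a replica that is down or unreachable
mid-update, which is where most of the complexity lies.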
AFS - a globally distributed file system.
we saw the naming convention that includes a site (cell) name in
the path (e.g., /afs/cs.cmu.edu/...)
use of a file cache on local disks - important, as accesses
may now (potentially) travel over the Internet, with much higher
latency
the system caches entire files locally, not individual disk
blocks
file permissions are now very important, as many users can
browse - AFS supports more complicated file permissions,
including ACLs
files can move among servers in the same cell without their
names changing