Lecture 17

Lecture 17 - Disk Scheduling, Files

Brain:	Here we are, Pinky - at the dawn of time!
Pinky:	Narf, Brain. Wake me at the noon of time.
	- Pinky and the Brain

Agenda

Announcements
Disk Scheduling
File System Interface: Files
File System Interface: Directories
File System Interface: Disks and Partitions
File System Implementation

Disk Scheduling Algorithms

We can minimize seek time by minimizing the distance the read/write head has to move in order to service the incoming requests.

Given a sequence of cylinders that must be visited to service a set of pending disk read/write requests, the system can order the requests to minimize seek time.

This may be done by the disk, the hardware controller, or by the operating system.

We will compare algorithms by examining their performance on a given request queue.

Given a disk with 200 cylinders (0-199), suppose we have 8 pending requests:

98, 183, 37, 122, 14, 124, 65, 67

and that the read/write read is currently at cylinder 53.

First-Come First-Served (FCFS)

We include this analog of FCFS CPU scheduling or FIFO page replacement mainly for comparison purposes.

Requests are serviced in queue order, for a total of 640 cylinders of movement.

Shortest Seek Time First (SSTF)/Closest Cylinder Next

Service the request next that has the shortest movement from the current position.

This is the analog of SJF CPU scheduling and OPT page replacement, but unlike those, it's actually possible, since we do have our actual request queue available to us.

In our example:

65, 67, 37, 14, 98, 122, 124, 183

The total seek distance is 236 cylinders.

Potential problem: if many requests keep arriving near where the disk head is positioned, distant requests may be starved.

SCAN or Elevator Algorithm

When an elevator is going in one direction, it stops at all the floors where there is a pending request. Then it reverses direction and does the same thing.

With this algorithm, the disk arm does just this. Service requests in one direction, then reverse direction.

In our example, assuming we are "going down" at the start:

37, 14, (0), 65, 67, 98, 122, 124, 183

236 cylinders again. It is a coincidence that this is the same as SSTF.

Note that the disk arm went all the way to 0, even though there were no requests below 14. This is because this particular algorithm doesn't look ahead, it just moves back and forth from one end to the other.

We can take care of that extra movement down to 0 with ...

LOOK Algorithm

It's the same as SCAN, but the head reverses direction as soon as there are no pending requests in the current direction.

The movement is the same as SCAN, just without that move from 14 to 0 and then up to 65. This reduces the movement to 208 cylinders.

Both SCAN and LOOK can lead to non-uniform waiting times. A request near one end of the disk sometimes needs to wait for two sweeps across the disk, while other times it will be serviced very quickly. Requests near the middle have a more uniform average waiting time.

Circular Algorithms

This problem can be addressed using circular versions of SCAN (C-SCAN) and LOOK (C-LOOK), where when the disk arm gets to the end of the disk, it jumps immediately back to the other end.

Assuming the disk services requests only when "going up", our example using these algorithms are served in order:

65, 67, 98, 122, 124, 183, 14, 37

With C-SCAN, the head goes all the way to 199 and all the way back to 0, giving total movement of 382. With C-LOOK, we do not need to go up past 183 or down past 14, making the movement total 322.

The penalty of the movement all the way back in the other direction may not be as large as it seems. Think of the mechanics of the situation - starting and stopping the disk arm takes more time than simply sweeping all the way across with just one acceleration and deceleration.

Comparing Disk Scheduling Algorithms

SSTF or LOOK are often reasonable for a default algorithm

SCAN and C-SCAN are better for heavily loaded systems where LOOK is unlikely to save much and SSTF runs the risk of starvation

performance depends on the frequency and types of requests

we may want to consider some of this when we look next week at how to organize file systems

FreeBSD's ufs filesystem (the one we use in the lab) uses an elevator algorithm. Here is the comment at the top of file /sys/ufs/ufs/ufs_disksubr.c:

/*
 * Seek sort for disks.
 *
 * The buf_queue keep two queues, sorted in ascending block order.  The first
 * queue holds those requests which are positioned after the current block
 * (in the first request); the second, which starts at queue->switch_point,
 * holds requests which came in after their block number was passed.  Thus
 * we implement a one way scan, retracting after reaching the end of the drive
 * to the first request on the second queue, at which time it becomes the
 * first queue.
 *
 * A one-way scan is natural because of the way UNIX read-ahead blocks are
 * allocated.
 */

File System Interface

We switch focus now to talk about how to organize information on disks.

Hopefully everyone has a good idea what we mean by a file.

files can be data or programs

can be simple or complex (plain text, or a specially-formatted file)

structure of a file is determined by both the OS and the program that creates it

files are stored in a file system

disk

tape

memory

Files have a number of attributes:

filename

file type (maybe)

location - where is it on the device

size

protection/permissions (maybe)

timestamp, ownership

directory information

A number of operations can be performed on files, many of which we have been doing without giving it miuch thought:

create

write/append

read

seek (reposition within file)

delete

truncate

open, close

File Types

How does a "type" get assigned to a file?

can use file extension (.c, .exe, .doc, .tex, .mp3, etc.)

file extension may or may not be functionally important - even if not, they give the OS and the user an hint as to the type of the file

Windows uses an unenforced file extension registration

Macintosh can enforce types within a file - special part of a file called a "resource fork" to store extra information including the application that created ir

Unix uses "magic numbers". See file command and /usr/shared/misc/magic in FreeBSD, /etc/magic in Solaris.

File access may be sequential or direct. Tapes support only sequential access, disk files may support both.

Directories

A listing of the files on a disk is a directory.

Both the directory structure and the files reside on the disk.

The directory may store some or all of the file attributes we discussed.

A directory should be able to support a number of common operations:

search for a file

create a file

delete a file

rename a file

listing of files

filesystem traversal (cd)

Directory Organization

There are many ways to organize a directory, with different levels of complexity, flexibility, and efficiency. We will look at several possibilities.

Single-Level Directory

The most simple method is to have one big list of all files on a disk.

This can be used for a simple system. A disk for the C-64 worked like this.

But, it breaks down pretty quickly:

cannot have two files with the same name - could be necessary for multiple users/programs on a disk

no way to group files - just one big list

searches need to look through the entire directory

Two-Level Directory

We can create a separate directory for each user:

files now have a path name /user1/have

different users can have the same file name (/user2/me and /user3/me)

searching is more efficient, as only one user's list needs to be searched

but still no grouping capability for a user's files

Tree-structured Directory

Something more reasonable and useable:

any directory entry can be either a file or a subdirectory

files can be grouped appropriately

search is more efficient - follow the path

add concept of a current working directory

traverse directory (cd)

operate on files in current directory by default

or specify path, either relative or absolute

need to be able to create and remove directories as well as files

consider: what happens when a non-empty directory is deleted?

Acyclic-Graph Directories

The tree model does not allow the same file to exist in more than one directory. We can provide this by making the directory an acyclic graph.

Two or more directory entries can point to the same subdirectory or file, but (for now) we restrict it to disallow any directory entry pointing "back up" the directory structure.

These kinds of directory graphs can be made using links in the Unix world, shortcuts in the Windows world, or aliases in the Mac world.

We have multiple names for and multiple paths to the same file.

More next time...

Agenda

Announcements

Disk Scheduling Algorithms

File System Interface

Directories