Computer Science 220
Assembly Language & Computer Architecture

Fall 2010, Siena College

Lab 2: Decoding MIPS Instructions
Due: 10:00 AM, Friday, October 1, 2010

As you have seen in class and in the readings from Patterson and Hennessy, MIPS machine instructions are encoded as 32-bit values. Each 32-bit value representing an instruction is viewed as a collection of fields that determine the instruction's operation and its operands. Your task for this week's lab assignment is to implement a MIPS instruction decoder in C.

You may work alone or with a partner for this lab assignment.

The first few sections of this lab handout contains some examples that you should work through during lab. Once you have completed those, you can move on to the C programming assignment. However, even if you have not gone through the examples (maybe you're in the Friday lab), you can start thinking about how to solve the problems and you can see how the program works by running my version.

Another C Example: Matrix-Matrix Multiplication

Let's consider a slightly larger example: matrix-matrix multiplication.

See Example:
~jteresco/shared/cs220/examples/matmult

This is another example of separate compilation - The function in timer.c will be useful if we want to measure execution times. We tell matmult.c about it with the line

#include "timer.h"

This provides a prototype of the function in timer.c. In many cases, this file would also define any data structures, constants, or macros used by the functions it defines.

This is a good model to use as you move forward and develop more complicated C programs. Group related functions into a C source file as you would group methods in a Java class or member functions in a C++ class.

Along those same lines, the include files in angle brackets

#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>

specify system-wide header files. By convention (though most compilers don't really make a distinction) system-wide header files are in angle brackets, while your own header files are in double quotes.

Each file can then be compiled separately to create an object file (.o file) from the C source. These object files are all listed at the linking step.

What happens for function diffgettime() at compile time? Link time?

The program uses two system calls: printf() and gettimeofday(). To see how these work, we can look at their man pages:

man printf

to see everything we wanted to know about a particular system call. But if you do this, you might get a man page for a command-line utility called printf instead of the system call printf(). Not what we were looking for. The Unix manual is divided up into sections. The most important of these sections, for our purposes, are Section 1: User Commands, and Section 3: Library Functions. If we don't ask for a section, we get section 1. Since section 1 contains an entry for printf, that's what it produced. To force it to give you the system call manual page, you can use:

man 3 printf

This actually tells it to look in section 3, which contains C library functions. How did I know to look in section 3? Mainly because the printf man page in section 1 told me so, at the bottom under the "See Also" section.

Fortunately, you only need to concern yourself with what section of the manual to use when you look something up that it in more than one section. For example,

man gettimeofday

brings up the man page we want, for the gettimeofday() system call in section 2 (the system calls section) If you see a reference to something like ctime(3) in the "See Also" section of a man page, such as that in gettimeofday()'s man page, that means the ctime() man page is in section 3. I will use this notation as appropriate throughout the semester.

You will find the Unix manual very helpful as we move forward.

So what does gettimeofday(2) do? See the man page and look at the usage in the example program.

Note that when we declare the variables start and stop as struct timeval, this allocates enough space to hold two copies of the fields in a struct timeval. This type of memory allocation does not exist in Java, but is very common in C programming. Like local variables of primitive types, the space allocated for start and stop goes away when the function in which they are declared returns to its caller. We will look more carefully at memory management in the next example.

gettimeofday(2) returns wall clock times. This is the amount of elapsed real time. So if our process is taking turns on the CPU with other processes (see the Operating Systems course) and it is not always running, it continues to accumulate wall clock time, but not CPU usage time. There are also system calls to examine CPU usage time which we may consider later.

Finally, the Makefile is using the GCC compiler (gcc) with the option -O for optimization. If you want to run this with a different compiler or optimization flags, you can change the CC= line in the Makefile.

If we compile and run this program, it reports initialization and matrix multiplication times. Initialization is just filling in matrices a and b. Then we compute the value of each element of c using the dot product of the corresponding row of a and column of b.

How long does it take this program to execute?

A Crash Course in Structures and Memory Management

And here is a somewhat silly example that demonstrates the use of structures and some memory management, but which should show you everything you need to know about memory management for this week's programming task.

See Example:
~jteresco/shared/cs220/examples/ratios

Again, we have a number of C source code (.c) and header (.h) files. We will consider each in turn.

The files gcd.h and gcd.c are the same as the ones you saw in last week's lab examples.

The files ratio.h and ratio.c define a structure and a number of functions that have to do with storing a ratio of two integer values.

In ratio.h, we have the definition of the structure that will hold our ratios:

typedef struct ratio {
  int numerator;
  int denominator;
} ratio;

There are two important things happening here. First, a structure called a struct ratio is defined. It consists of two int values: numerator and denominator. In many ways, these are like the instance variables of a Java class, but there are no access protections (they are not "private" or "protected", but the equivalent of "public". Second, we are giving another (shorter) name to our struct ratio: simply ratio. This is being accomplished by the typedef. In general, a typedef can define a new name for any type:

typedef x y;

would define a new type named y which is just another name for an already-existing type named x.

In our case, the typedef just means we can refer to variables and parameters of type struct ratio as simply ratio.

The rest of the contents of ratio.h defines function prototypes for the functions that will be defined in ratio.c that can be called from elsewhere.

As a whole, the information in ratio.h tells a C source file that would like to work with these ratio structures everything it needs to know to compile.

In ratio.c, the four functions that operate on ratios are defined: create_ratio constructs a new ratio given a numerator and a denominator, add_ratios takes two existing ratios, adds them and constructs and returns a new ratio that represents their sum in lowest terms, reduce_ratio takes an existing ratio and reduces it to lowest terms, and finally, print_ratio takes an existing ratio and prints it in a reasonably nice format.

There are a number of things to consider in these functions. The first two functions return a value of type ratio *. This indicates a pointer to a ratio structure. The last three functions take one or two parameters of this same type, ratio *.

Perhaps the most important thing to note here is how we allocate the memory for these structures. In both create_ratio and add_ratios, we see the line:

  ratio *r = (ratio *)malloc(sizeof(ratio));

This is C's way of doing the equivalent of a Java new operation. This line:

  1. declares a variable r of type ratio *
  2. initializes r to the return of the function malloc
  3. malloc reserves a chunk of memory of the requested number of bytes and returns a pointer to the start of the memory segment
  4. the sizeof operator determines the number of bytes in the type to which it applies - in this case ratio, which should be a total of 8 bytes
  5. since malloc does not return a ratio * (it returns a void *, which is a generic pointer), we need the cast to tell the compiler that we will be treating this newly-allocated chunk of memory as a ratio *

Note also the way we refer to the fields of the ratio structure when the variable r contains a pointer to a ratio:

  r->numerator = numerator;

This is functionally the equivalent of the Java statement:

  r.numerator = numerator;

However, since C allows a variable referring to a structure to be either a pointer or the structure itself, there are two different notations. If we had a variable r of type ratio rather than ratio *, we would use the "dot" notation like we use in Java. But here, since we have pointers, we use the "arrow" notation.

A very important difference between C and Java is that memory management in C is not garbage collected. That means that every chunk of memory we obtain with malloc must be returned to the system for reuse by a call to the function free. In our case, these free calls are made in ratios.c. For each call to create_ratio or add_ratios, which each contain a call to malloc, there must be a corresponding call to free.

This brings us to the file ratios.c, which is a main function that makes use of the ratio structure and functions to demonstrate the complexities of C memory management.

Read over the comments in ratios.c and see if you can understand how the memory is being allocated and managed. When everyone gets to this point during lab, we will go through the example and draw a representation of the memory as it evolves during the execution of main.

Lab Assignment: Decoding MIPS Instructions

Your program should read in MIPS instructions as hexadecimal numbers from standard input. Each instruction (which you should read into an unsigned int) is then to be passed to a function that will parse the instruction bit fields and return a structure that contains these values in a more convenient format. It should then call another function that prints out the values of the fields.

Getting Started

A starter framework for this lab assignment can be found on the CS Lab Linux systems in

~jteresco/shared/cs220/labs/mipsdecode/mipsdecode.tar.gz

Copy that file to your account, extract it to an appropriate work area, and change to that directory. To extract it, you should be able to issue the command

tar xvfz mipsdecode.tar.gz

This will create a new directory mipsdecode in your current directory. When you change to that directory, you should find five files:

Your task is to fill in the missing implementations of the functions parseMipsInstruction and dumpMipsInstruction in mipsInstruction.c, and main in driver.c.

My Implementation

I have provided .o files that you can link with to try out my implementations of driver.c and/or mipsInstruction.c. See the Makefile for details. Your program's behavior should be identical to that of mine when you have completed the assignment.

Main Tasks

Your main tasks to complete this lab:

  1. Develop the main input loop in main().
  2. Write parseMipsInstruction().
  3. Write dumpMipsInstruction().

Your code should compile with no warnings when using the -Wall flag to gcc (as is used by the provided Makefile). You should be careful to free any allocated memory at an appropriate time.

Submission and Evaluation

I will test your program on a larger input set than that provided in the instructions file. It is up to you to make sure you test all cases that may not be covered in that file.

The lab will be graded out of 30 points.

By 10:00 AM, Friday, October 1, 2010, submit documented source code for your program. All necessary files should be submitted by email to jteresco AT siena.edu. Please submit a single tar file called mipsdecode.tar that contains four files: Makefile, driver.c, mipsInstruction.h, and mipsInstruction.c. Make sure your name is in the C files that you modified. Please do not include any additional files, such as emacs backup files, object files, or your executable.

Grading Breakdown

Correctness 20 points
Efficiency 4 points
Style and Documentation 6 points

The efficiency portion of the grade may be impacted, for example, if you use unnecessarily complex functions or operations to break down the input number into fields. Good style requires appropriate code formatting and the use of meaningful names for variables, structures, structure fields, and functions. A well-documented submission should include a comment in each file describing the contents of the file, a comment with each function describing what it does, what parameters it expects, and any restrictions on its usage, and comments clarifying any potentially confusing code segments.