Computer Science 220
Assembly Language & Computer Architecture
Fall 2011, Siena College
During this week's lab meeting, you will learn about instruction set architectures other than MIPS. Be sure you understand everything here - all topics in this lab handout are potential final exam question topics.
You may work individually or with a partner on all parts of this lab.
There is a series of questions you need to answer and turn in by email. You may wish to open an email draft or start editing a document to record your answers as you go. Start by placing your name and the name of your partner (if you have one) in this document.
Addressing Modes
Before we consider specific instruction set architectures, we review and give examples of addressing modes. The addressing mode of each operand of a machine instruction determines how the instruction finds the data specified by that operand (either source or destination).
Some very common addressing modes include:
We used register indirect with offset addressing for accessing array elements where we had a constant subscript:
lw $5, 16($6) # load a[4] if $6 is a pointer to a
We also saw this when saving and restoring registers on the stack:
sw $9, -4($sp) # store $9 on stack at offset -4
Now, suppose we have a C structure (think Java class if C structures upset you too much):
struct ratio {
    int numerator;
    int denominator;
};
and the following C code, where r is a pointer to one of these structures:
int n;
int d;
n = r->numerator;
d = r->denominator;
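Both the array access and the structure access above come down to a base address plus a constant byte offset. The following is a minimal C sketch of how those offsets can be computed, assuming a 4-byte int (as on 32-bit MIPS); the helper names int_elem_offset and load_word are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct ratio { int numerator; int denominator; };

/* Byte offset of element i of an int array from the array's base address. */
size_t int_elem_offset(size_t i) {
    return i * sizeof(int);
}

/* Read one int at base + offset, the way "lw reg, offset(base)" would.
   Hypothetical helper, not part of any library. */
int load_word(const void *base, size_t offset) {
    assert(offset % sizeof(int) == 0);   /* word loads must be aligned */
    int value;
    memcpy(&value, (const char *)base + offset, sizeof value);
    return value;
}
```

Where int is 4 bytes, int_elem_offset(4) is 16, matching the 16($6) operand shown earlier. The standard offsetof macro gives the corresponding constants for the fields of struct ratio, for example load_word(r, offsetof(struct ratio, denominator)).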
There are also some less-common addressing modes worth noting:
For example, an operand for the Motorola 68000 might be specified as:
4(%a4,%d1.w)
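%a4 and %d1 are register names. %a4 is the base register and %d1 is the index register; the .w suffix specifies that only the low-order word of %d1 (2 bytes on the M68K) is used as the index, sign-extended to 32 bits. The original 68000 does not scale the index, so the effective address is %a4+%d1.w+4. The 68020 and later accept an optional scale factor, written as in 4(%a4,%d1.w*2), which gives %a4+2*%d1.w+4.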
This is also available in the Motorola 68000, where it takes the form:
-(%a7)
Again, on the M68K, this looks like:
(%a7)+
These are typically used only with control flow instructions (branches and jumps).
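The M68K operand forms -(%a7) and (%a7)+ shown above correspond closely to the C idioms *--p and *p++. Here is a small sketch of a downward-growing stack built from those two idioms; the toy_stack type and its size are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16

/* Hypothetical toy stack: sp starts one past the end and grows downward. */
struct toy_stack {
    int32_t  mem[STACK_WORDS];
    int32_t *sp;
};

void stack_init(struct toy_stack *s) {
    s->sp = s->mem + STACK_WORDS;
}

/* Like move.l value,-(%a7): decrement the pointer first, then store. */
void push(struct toy_stack *s, int32_t value) {
    assert(s->sp > s->mem);                /* overflow check */
    *--s->sp = value;
}

/* Like move.l (%a7)+,dest: load first, then increment the pointer. */
int32_t pop(struct toy_stack *s) {
    assert(s->sp < s->mem + STACK_WORDS);  /* underflow check */
    return *s->sp++;
}
```

Because the pointer update is folded into the operand, each push or pop is a single instruction on the M68K, where MIPS needs a separate addi to adjust $sp.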
More Complex Instructions
All MIPS machine instructions are, by design, very limited in what they can do. This is not true of all architectures. Consider some examples from other architectures:
        MOVAW  DATA,R6        ; load array ptr into R6
        MOVL   NUM,R9         ; initialize R9 with the number of elts
DOUBLE: ADDW2  (R6),(R6)+     ; double entry, increment ptr
        SOBGTR R9,DOUBLE      ; loop control
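In C terms, the VAX loop above does roughly the following. This is a sketch rather than a translation a compiler would produce; the function name double_words is made up, and it assumes the entries are 16-bit words, as the W suffix on ADDW2 indicates.

```c
#include <assert.h>
#include <stdint.h>

/* Doubles each of the num 16-bit entries starting at data.
   Like the VAX loop, the test comes at the bottom, so num must be >= 1. */
void double_words(int16_t *data, int32_t num) {
    assert(num >= 1);
    int16_t *p = data;              /* MOVAW DATA,R6 */
    int32_t count = num;            /* MOVL NUM,R9 */
    do {
        *p = (int16_t)(*p + *p);    /* ADDW2 (R6),(R6)+ : add the entry to itself */
        p++;                        /* ... and advance the pointer by one word */
    } while (--count > 0);          /* SOBGTR R9,DOUBLE */
}
```

The single ADDW2 instruction covers the load, the add, the store, and the pointer increment; SOBGTR covers the decrement, the comparison, and the branch.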
CMPC5 R5,STRING1,#^A/ /,R7,STRING2
This compares the character string whose starting address is specified by a label STRING1 and whose length is in register R5 to the string labelled STRING2, length R7, using the space character to pad the shorter string for comparison purposes, using registers R0, R1, R2 and R3 to store information about the result of the comparison.
Not only does this instruction take a long time, but the amount of time it takes depends on the length of the strings being compared!
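To get a sense of how much work is packed into that one instruction, here is a C sketch of a padded string comparison. It is an approximation only: the real CMPC5 reports its result through the condition codes and registers R0 through R3, while this hypothetical compare_padded function simply returns a negative, zero, or positive value.

```c
#include <assert.h>
#include <stddef.h>

/* Compare two byte strings as if the shorter one were padded with fill.
   Returns -1, 0, or 1. */
int compare_padded(const unsigned char *s1, size_t len1,
                   const unsigned char *s2, size_t len2,
                   unsigned char fill) {
    assert(s1 != NULL && s2 != NULL);
    size_t longest = len1 > len2 ? len1 : len2;
    for (size_t i = 0; i < longest; i++) {
        unsigned char c1 = i < len1 ? s1[i] : fill;
        unsigned char c2 = i < len2 ? s2[i] : fill;
        if (c1 != c2)
            return c1 < c2 ? -1 : 1;
    }
    return 0;
}
```

The loop runs once per character until a mismatch is found, which is why the running time of the instruction grows with the length of the strings.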
swap:
    link %a6,#0
    movem.l #0x010f,-(%sp)
    ... code for swap subroutine that ...
    ... uses the saved registers ...
    movem.l (%sp)+,#0xf080
    unlk %a6
    rts
The hexadecimal constants are bit fields specifying which registers should be pushed or popped. If there is a 1 in a given position, the corresponding register is pushed or popped.
Note that the constants each contain 5 bits set to 1, and that the second is the bit-reversal of the first. This is because popping is done in the opposite order from pushing (it is a stack, after all).
The code above would replace the longer code:
swap:
    link %a6,#0
    move.l %d0,-(%sp)
    move.l %d1,-(%sp)
    move.l %d2,-(%sp)
    move.l %d3,-(%sp)
    move.l %a0,-(%sp)
    ... code for swap subroutine that ...
    ... uses the saved registers ...
    move.l (%sp)+,%a0
    move.l (%sp)+,%d3
    move.l (%sp)+,%d2
    move.l (%sp)+,%d1
    move.l (%sp)+,%d0
    unlk %a6
    rts
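The two observations about the movem.l constants (five bits set in each, and each the bit-reversal of the other) can be checked with a few lines of C. The helper names below are made up for illustration, and the sketch deliberately says nothing about which bit corresponds to which register.

```c
#include <assert.h>
#include <stdint.h>

/* Count the bits set to 1 in a 16-bit register mask. */
int count_set_bits(uint16_t mask) {
    int count = 0;
    while (mask != 0) {
        count += mask & 1u;
        mask >>= 1;
    }
    return count;
}

/* Reverse the order of the 16 bits: bit i moves to bit 15 - i. */
uint16_t reverse_bits16(uint16_t mask) {
    uint16_t result = 0;
    for (int i = 0; i < 16; i++) {
        result = (uint16_t)((result << 1) | (mask & 1u));
        mask >>= 1;
    }
    return result;
}

/* Check the two constants used in the swap example. */
void check_swap_masks(void) {
    assert(count_set_bits(0x010f) == 5);
    assert(count_set_bits(0xf080) == 5);
    assert(reverse_bits16(0x010f) == 0xf080);
}
```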
RISC vs. CISC
In class, we are studying how to implement the MIPS ISA directly in hardware, including a pipelined implementation that handles complexities like data forwarding and hazard detection.
This is possible in a course such as this because of the simplicity of its design. MIPS instructions each have a single, simple function, all instructions are the same size (one 32-bit word), and there are very few addressing modes.
These are some of the features common to Reduced Instruction Set Computer (RISC) architectures, of which MIPS is an excellent example.
CISC Architectures
Other architectures are Complex Instruction Set Computer (CISC) architectures. Examples include IBM mainframe architectures, the Digital Equipment Corporation PDP-11 and VAX architectures, the Motorola 68000 series, and the Intel x86 family of architectures.
As the name indicates, and as you saw in the previous section, instructions in a CISC architecture can be quite complex.
Many CISC architectures are too complex to be implemented directly in hardwired logic. In these cases, an implementation might pair simpler hardware (a microarchitecture) with an interpreter that executes each machine instruction, guided by a microprogram.
Decoding and executing an instruction can require several steps (microinstructions), and the more complex the instruction (more operands, etc) the longer it will take.
Typical CISC architecture characteristics:
If you have to program directly in assembly language, a CISC architecture can make this task easier. However, compilers may make use of only a subset of the available instructions.
RISC Architectures
Researchers (including our text authors) in the early 1980's advocated for the RISC approach.
Characteristics of RISC architectures:
If you have to program by hand, RISC architectures will be more difficult. The task is easier for, and intended to be done by, a compiler.
Sun SPARC
Sun Microsystems' SPARC architectures enjoyed a long period of success from the early 1990's to the early 2000's.
SPARC is a RISC architecture, and contains many of the same features as MIPS.
Register Windows
A RISC system may have hundreds of registers in its register file.
One way to organize these registers is to treat the register file as circular and give each routine a "view" of a limited subset of these registers at any given time.
SPARC has many, many registers, but only 32 are visible at any given time.
One subset of registers is designated as "globals" and is visible to all routines.
Then, each routine has three subsets of registers: "ins", "locals", and "outs".
The "outs" of a routine become the "ins" of a subroutine that it calls.
Much of what we could do with the stack for parameter passing and local variable storage can be replaced with this.
We eliminate some of the problems of registers getting clobbered by subroutines and we speed function calls by reducing memory accesses needed to pass parameters on the stack.
But what if we have a deep call stack and we run out of registers? We can't just clobber things, so we'd need to keep a stack also for those cases.
The actual implementation involves "pretending" to write things to the stack, but it only actually does this ("spill" to the stack, and "fill" from the stack to restore) if necessary given the current state of the register file.
Problem: what about functions with more parameters than can be passed in the register window? We still need the stack for those.
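The overlap between a caller's "outs" and its callee's "ins" can be modeled in a few lines of C. This is a toy model under stated assumptions: the window count of 8, the struct layout, and the function names are all chosen for illustration, and it does not model spilling and filling or the fact that a real SPARC's %g0 always reads as zero.

```c
#include <assert.h>
#include <stdint.h>

#define NWINDOWS 8                  /* assumed window count, for illustration */
#define NPHYS    (NWINDOWS * 16)    /* each window adds 8 outs + 8 locals */

struct regfile {
    int32_t globals[8];             /* r0..r7, visible to all routines */
    int32_t windowed[NPHYS];        /* circular storage behind r8..r31 */
    int     cwp;                    /* current window pointer */
};

/* Map visible register r (0..31) in the current window to its storage. */
int32_t *reg(struct regfile *rf, int r) {
    assert(r >= 0 && r < 32);
    if (r < 8)
        return &rf->globals[r];
    if (r < 24)                     /* r8..r15 are outs, r16..r23 are locals */
        return &rf->windowed[(rf->cwp * 16 + (r - 8)) % NPHYS];
    /* r24..r31 are ins: the same storage as the outs of the next window up */
    return &rf->windowed[((rf->cwp + 1) * 16 + (r - 24)) % NPHYS];
}

/* Enter a subroutine: slide to a fresh window (like SPARC's save). */
void window_save(struct regfile *rf) {
    rf->cwp = (rf->cwp + NWINDOWS - 1) % NWINDOWS;
}

/* Return to the caller's window (like SPARC's restore). */
void window_restore(struct regfile *rf) {
    rf->cwp = (rf->cwp + 1) % NWINDOWS;
}
```

A caller that writes a parameter to its first out register (r8) and then calls window_save will find the same value in the callee's first in register (r24), with no memory traffic. Because the storage is circular, a ninth nested window_save in this model would silently reuse the first window's registers, which is exactly the situation where real hardware has to spill to the stack.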
Other SPARC Features
A few final notes about SPARC:
x86/IA-32 ISA
This section presents some highlights of the text's discussion of the Intel x86 architecture in Section 2.17. Now that you know something about the MIPS ISA and its simplicity, that section will make for interesting reading.
A Brief History of the x86
x86/IA-32 Overview
x86 Basics: Registers, Data Types, and Memory
A diagram of the main registers can be found in Figure 2.36.
%eax | accumulator (for arithmetic ops)
%ebx | base (address of array in memory)
%ecx | count (of loop iterations)
%edx | data (e.g., second operand for binary operations)
%esi | source index (for string copy or array access)
%edi | destination index (for string copy or array access)
%ebp | base pointer (base of current stack frame)
%esp | stack pointer (top of stack)
%eip | instruction pointer (program counter)
%eflags | flags (condition codes and other things)
x86 ISA
addl %eax,%ebx # EBX <= EBX + EAX
addl $20,%esp # ESP <= ESP + 20
Beware: removing the dollar sign changes the meaning:
addl 20,%esp # ESP <= ESP + M[20]
This adds the contents of memory location 20 to ESP, rather than the value 20.
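The difference between the two forms of addl can be spelled out in C. This is only a sketch: memory is modeled as a small byte array, and the names add_immediate, add_from_memory, and MEM_BYTES are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MEM_BYTES 64   /* size of the hypothetical toy memory */

/* addl $imm,%esp : add the constant itself. */
uint32_t add_immediate(uint32_t esp, uint32_t imm) {
    return esp + imm;
}

/* addl addr,%esp : add the 32-bit value stored at memory address addr. */
uint32_t add_from_memory(uint32_t esp, const uint8_t *mem, uint32_t addr) {
    assert(addr + sizeof(uint32_t) <= MEM_BYTES);
    uint32_t value;
    memcpy(&value, mem + addr, sizeof value);
    return esp + value;
}
```

If the word at address 20 happens to hold 1000, the first form adds 20 and the second adds 1000, so a missing dollar sign can be a hard bug to spot.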
movl $10,%esi # ESI <= 10
movl %eax,%ecx # ECX <= EAX
xorl %edx,%edx # EDX <= 0
displacement(SR1,SR2,scale)
which multiplies SR2 by scale, then adds both SR1 and displacement.
This complex addressing supports array accesses generated by high-level programs.
For example, to access the ith element of an array of 32-bit integers, one could put a pointer to the base of the array into EBX and the index i into ESI, and execute
movl (%ebx,%esi,4),%eax # EAX <= M[EBX + ESI * 4]
If the array started at the 28th byte of a structure, and EBX instead held a pointer to the structure, one could still use this form by adding a displacement:
movl 28(%ebx,%esi,4),%eax # EAX <= M[EBX + ESI * 4 + 28]
scale can be 1, 2, 4, or 8, and defaults to 1.
movb (%ebp),%al # AL <= M[EBP]
movb -4(%esp),%al # AL <= M[ESP - 4]
movb (%ebx,%edx),%al # AL <= M[EBX + EDX]
movb 13(%ecx,%ebp),%al # AL <= M[ECX + EBP + 13]
movb (,%ecx,4),%al # AL <= M[ECX * 4]
movb -6(,%edx,2),%al # AL <= M[EDX * 2 - 6]
movb (%esi,%eax,2),%al # AL <= M[ESI + EAX * 2]
movb 24(%eax,%esi,8),%al # AL <= M[EAX + ESI * 8 + 24]
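The effective address computed by displacement(SR1,SR2,scale) is the same arithmetic a C compiler performs for array and structure accesses. Below is a minimal sketch; the effective_address function and the record structure (whose array happens to start at byte 28, as in the example above) are hypothetical and exist only to illustrate the calculation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Effective address for displacement(SR1,SR2,scale). */
uintptr_t effective_address(uintptr_t sr1, uintptr_t sr2,
                            unsigned scale, intptr_t displacement) {
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return sr1 + sr2 * scale + (uintptr_t)displacement;
}

/* Hypothetical structure whose int32_t array starts at byte 28. */
struct record {
    char    header[28];
    int32_t values[10];
};
```

With a struct record named rec, effective_address((uintptr_t)&rec, i, 4, 28) yields the same address as &rec.values[i], which is why a single x86 instruction can fetch a structure's array element that would take several MIPS instructions.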
Figure 2.38 also shows addressing modes with MIPS equivalents.
movb 100,%al # AL <= M[100]
movb label,%al # AL <= M[label]
movb label+10,%al # AL <= M[label+10]
movb 10(label),%al # NOT LEGAL!
movb label(%eax),%al # AL <= M[EAX + label]
movb 13+8*8-35+label(%edx),%al # AL <= M[EDX + label + 42]
movl $label,%eax # EAX <= label
movl $label+10,%eax # EAX <= label+10
movl $label(%eax),%eax # NOT LEGAL!
Note: most, but not all instructions affect all flags
All except can be inverted by inserting an "N" after the initial "J": JNB jumps if the carry flag is clear.
CALL pushes EIP, and its target can come from one of many addressing modes:
call printf # (push EIP), EIP <= printf
call *%eax # (push EIP), EIP <= EAX
call *(%eax) # (push EIP), EIP <= M[EAX]
call *fptr # (push EIP), EIP <= M[fptr]
call *10(%eax,%edx,2) # (push EIP), EIP <= M[EAX + EDX*2 + 10]
RET (return) pops the return address off the stack and into EIP.
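The indirect forms of CALL are what make C function pointers cheap on the x86. The sketch below shows C constructs that typically lead to direct and indirect calls; the function names are made up, and which instruction a compiler actually emits depends on the compiler and its optimization settings.

```c
#include <assert.h>

typedef int (*unary_fn)(int);

int twice(int x)  { return 2 * x; }
int square(int x) { return x * x; }

/* Direct call: the target is a fixed label, as in "call printf". */
int call_direct(int x) {
    return twice(x);
}

/* Indirect call through a pointer value, as in "call *%eax"
   once the pointer has been loaded into a register. */
int call_through_pointer(unary_fn f, int x) {
    assert(f != 0);
    return f(x);
}

/* Indirect call through a table entry: the target is read from memory
   at table + index * sizeof(entry), a scaled-index memory operand. */
int call_through_table(const unary_fn *table, int index, int x) {
    return table[index](x);
}
```

In every case the return address is pushed before control transfers, so the callee's RET works the same way regardless of how the target address was obtained.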
x86 Wrapup
Just how complex is the x86 ISA today?
See Figure 2.43 for a summary.
Comparing Assembly Language Programs
We will consider the assembly language code generated from a few simple C programs for some of the ISAs we consider:
See Example:
~jteresco/shared/cs220/examples/assembly
Copy these to your own account or computer.
Here, you will find two C files, each of which contains a simple C function. You will also find compiler-generated assembly language code for 5 architectures: x86, M68K, MIPS, PPC, and SPARC.
You will notice that while each architecture has its own set of instructions, register names, and addressing modes, each is unmistakably an assembly language program.
Consider the files multiply-mips-gcc.s, multiply-i386-gcc295.s, multiply-sparc-cc.s, multiply-sparc-gcc.s, multiply-m68k-gcc.s, and multiply-ppc-gcc.s when answering the following questions.
Submission and Grading
This lab is graded out of 25 points. The points per question are specified above.
By 4:00 PM, Monday, December 12, 2011, please submit your responses to the questions by email to jteresco AT siena.edu. You may supplement your submission with a handwritten response to some or all questions.