CS Colloquium this afternoon: Craig Kaplan,
University of Waterloo, "Complexity and Aesthetics in
Computer-Generated Mazes". 2:35 PM TCL 206, snacks upstairs
before the talk.
Lab 6 back (finally)
Labs 8 and 9 continue
More on the exam
topic cutoff: end of last class
ground rules: take home exam, 47:10 to do it (use as much
of that time as you'd like), same resources legal as last time
practice exam - from last year
we covered the same topics, but you already had one exam,
so the emphasis is on things since that exam
time reserved in class on Monday for questions and answers
Microarchitecture
Basics: calculator device (see Lecture 22 notes)
Tanenbaum's MIC1 Microarchitecture
As defined in a previous version of the Tanenbaum text - see handout.
The MIC1 is a minimalist microarchitectural datapath (Tanenbaum
4-8). It contains:
A scratchpad or register file.
Each register may
be independently loaded onto two buses (A and B) and
loaded from a third (C). At most one register is attached to
each bus at a time.
Each register has control lines that govern the
targeted output buses, and a strobe line that allows latching from
the input bus.
A register's strobe line is high only when that register is
selected by the C lines and the enable C line (ENC) is high.
An ALU-shifter pair.
The ALU supports (in Tanenbaum) four
operations: A+B, A AND B, NOT(A), and A.
The shifter supports three operations: logical shift left or right by
one bit, or do nothing.
Both require two control lines; the ALU also
delivers N (negative) and Z (zero) bits that describe the
value it is currently computing. Both units are combinational.
Two buses (A and B) that deliver data from the
scratchpad to the ALU, and a bus C that delivers data from the
shifter to the scratchpad.
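A behavioral sketch of the ALU-shifter pair may help fix the idea. The ALU codes follow the function table given later in these notes; the shifter encoding (0 = no shift, 1 = right, 2 = left) and the names alu, shifter and datapath are assumptions made for illustration.

```python
MASK = 0xFFFF  # MIC1 data words are 16 bits wide

def alu(op, a, b):
    """Behavioral ALU: returns (result, N, Z) for a 2-bit operation code."""
    if op == 0b00:
        result = (a + b) & MASK      # A+B, carry out of bit 15 is dropped
    elif op == 0b01:
        result = a & b               # A AND B
    elif op == 0b10:
        result = ~a & MASK           # NOT A
    else:
        result = a                   # pass A through unchanged
    n = (result >> 15) & 1           # N: sign bit of the result
    z = 1 if result == 0 else 0      # Z: set when every result bit is 0
    return result, n, z

def shifter(op, value):
    """Shifter; the encoding 0=none, 1=right, 2=left is an assumption."""
    if op == 1:
        return value >> 1            # logical shift right by one bit
    if op == 2:
        return (value << 1) & MASK   # logical shift left by one bit
    return value                     # do nothing

def datapath(a_bus, b_bus, alu_op, sh_op):
    """Combinational path from the A and B buses to the C bus."""
    result, n, z = alu(alu_op, a_bus, b_bus)
    return shifter(sh_op, result), n, z
```

For example, datapath(3, 4, 0b00, 2) adds 3 and 4 and shifts the sum left one bit, putting 14 on the C bus. Note that N and Z describe the ALU output, not the shifted value.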
In many respects, the performance and general success of the
architecture hinges on the design of its datapath.
The path from the scratchpad through the
ALU/shifter and back to the scratchpad has to be short enough that
information can make it all the way around in one "cycle".
It also contains a minimalist memory interface.
Memory is connected to the CPU by an address bus
and a data bus. The outward, unidirectional address bus is driven
by the memory address register (MAR). The bidirectional
data bus is read or written by the memory buffer register
(MBR).
Reading and writing are governed by two lines: RD and
WR. It is possible that neither is high.
The MAR is a simple register, optionally loaded from the latch
that captures the B bus.
The MBR is a dual-ported register that may be loaded
(MBR high) from the data bus (on RD), or from the C
bus (on WR). Its value is always available at the ALU as
an alternative to the A bus; the signal AMUX controls this
alternative.
Memory is much slower than the data path, but we will assume
it's only half the speed (takes two data path cycles for a memory
request to complete).
In real life, there is a much larger discrepancy.
Building the components:
Recall: Design of a generic register.
For n-bit register (e.g., MAR):
n D-type latches/flip-flops
n input lines
CLK input to strobe in data from inputs
registers in MIC1 are 16 bits
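As a rough software model of the generic register, assuming a level-sensitive strobe (the stored value follows the inputs only while CLK is high); the class name Register and its method names are hypothetical:

```python
class Register:
    """n-bit register modeled as n one-bit storage cells."""

    def __init__(self, width=16):
        self.width = width
        self.bits = [0] * width          # one D-type cell per bit

    def clock(self, inputs, clk):
        """Copy the n input lines into the cells only when clk is high."""
        if clk:
            self.bits = [(inputs >> i) & 1 for i in range(self.width)]
        return self.value()

    def value(self):
        """Reassemble the stored bits into an integer."""
        return sum(bit << i for i, bit in enumerate(self.bits))

mar = Register(16)
mar.clock(0x1234, clk=1)   # strobed in
mar.clock(0xFFFF, clk=0)   # ignored: CLK is low
```

After these two calls the register still holds 0x1234, because the second set of inputs arrived without a strobe.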
Design of the scratchpad register.
all 16 look like this
Ai, Bi, Ci are decoded outputs of the A, B, C lines
ENC enables input from the C bus, so the value is copied in
only when this register is selected by the C lines and ENC
enables the C bus
The tristate buffers activating the output onto the A
bus or B bus pass through that output when this register is
selected to drive that bus
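The select-and-enable behavior described above can be sketched as follows. This is a functional model only, not the gate-level circuit; decode, Scratchpad, drive and strobe are names chosen here for illustration.

```python
def decode(select, n=16):
    """4-to-16 decoder: exactly one of the n output lines is high."""
    return [1 if i == select else 0 for i in range(n)]

class Scratchpad:
    def __init__(self):
        self.regs = [0] * 16             # sixteen 16-bit registers

    def drive(self, select):
        """Value put on the A or B bus by the one register whose
        tristate buffer is enabled by the decoded select line."""
        lines = decode(select)
        return next(self.regs[i] for i in range(16) if lines[i])

    def strobe(self, c_select, enc, c_bus):
        """Latch the C bus into register i only when Ci AND ENC is high."""
        c_lines = decode(c_select)
        for i in range(16):
            if c_lines[i] and enc:
                self.regs[i] = c_bus & 0xFFFF

sp = Scratchpad()
sp.strobe(3, 1, 0x1234)   # register 3 selected and ENC high: value stored
sp.strobe(3, 0, 0xFFFF)   # ENC low: register 3 keeps its old value
```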
Design of the dual-ported register.
used for the MBR
Design of a bit-slice of the ALU.
The ALU has functions:
  code  function
  00    A+B
  01    A AND B
  10    NOT A
  11    A
consider a one-bit slice of the ALU
we can add some logic to compute the Z status bit by
ORing together all of the 1-bit outputs of the ALU slices
and inverting the result (Z is 1 only when every output bit is 0)
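A bit-slice can also be tried out in software. The sketch below chains sixteen one-bit slices with a ripple carry, which is an assumption about the adder design, and derives Z as the inverted OR of the slice outputs, since Z should be 1 exactly when every output bit is 0. The function names are hypothetical.

```python
def alu_slice(f1, f0, a, b, carry_in):
    """One-bit ALU slice: returns (out, carry_out) for function f1 f0."""
    if (f1, f0) == (0, 0):               # A+B: a full adder
        total = a + b + carry_in
        return total & 1, total >> 1
    if (f1, f0) == (0, 1):               # A AND B
        return a & b, 0
    if (f1, f0) == (1, 0):               # NOT A
        return 1 - a, 0
    return a, 0                          # A

def alu16(f1, f0, a, b):
    """Sixteen slices with a ripple carry; returns (result, N, Z)."""
    carry = 0
    bits = []
    for i in range(16):
        out, carry = alu_slice(f1, f0, (a >> i) & 1, (b >> i) & 1, carry)
        bits.append(out)
    result = sum(bit << i for i, bit in enumerate(bits))
    n = bits[15]                         # N is the high-order output bit
    z = int(not any(bits))               # Z is the NOR of all output bits
    return result, n, z
```

Here alu16(0, 0, 0x00FF, 0x0001) ripples a carry through the low eight slices and yields 0x0100 with N = 0 and Z = 0.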
Recall: A full-featured version of the shifter.
See Fig. 3-16 of Tanenbaum 2006.
Completing the blueprint: the control store:
A (horizontal) microinstruction is simply the
collection of all the control lines necessary to determine the path
of data in a cycle about the datapath.
In Tanenbaum's architecture this consists of 22 bits:
4 each for addresses of registers for the
A, B and C buses
6 for AMUX, MAR, MBR, RD, WR,
and ENC
2 for the ALU operation
2 for the shifter operation
The collection of potentially useful microinstructions--the
microprogram--is stored in a small, addressable memory, the
control store. In Tanenbaum, there are 256 microcode
instructions.
A microprogram counter or MPC is an 8-bit value that
selects the current microinstruction to be executed.
By default, the next microinstruction to execute is the one that
follows the current one in the control store. An
incrementor computes the next MPC, keeping only 8 bits (i.e., mod
256).
Conditional branching (and with it loops, etc.) is made possible by
extending each microinstruction with an alternative microinstruction
address (8 bits), and two bits that select the alternative
instruction when certain ALU conditions are met.
The microsequencer determines the interpretation of these bits.
The total is, then, 22+8+2=32 bits per microinstruction.
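To make the bit count concrete, here is one way the 32 bits might be packed into a word. The field widths come from the notes; the order of the fields within the word, and the name SH for the shifter field, are assumptions for this sketch (see the handout for the actual layout).

```python
# (name, width) pairs, listed from the most significant bits down.
# The ordering is an assumption; only the widths come from the notes.
FIELDS = [("AMUX", 1), ("COND", 2), ("ALU", 2), ("SH", 2),
          ("MBR", 1), ("MAR", 1), ("RD", 1), ("WR", 1), ("ENC", 1),
          ("C", 4), ("B", 4), ("A", 4), ("ADDR", 8)]

TOTAL_BITS = sum(width for _, width in FIELDS)   # 22 + 2 + 8 = 32

def pack(**values):
    """Build a microinstruction word; fields not named default to 0."""
    word = 0
    for name, width in FIELDS:
        v = values.get(name, 0)
        if not 0 <= v < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | v
    return word

def unpack(word):
    """Split a microinstruction word back into its named fields."""
    fields = {}
    shift = TOTAL_BITS
    for name, width in FIELDS:
        shift -= width
        fields[name] = (word >> shift) & ((1 << width) - 1)
    return fields

# r3 := r1 + r2 (ALU code 00), no shift, no branch
add_word = pack(A=1, B=2, C=3, ENC=1)
```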
There is nothing special about the number 32 in this case;
microcode could have, say, 31 bits or 47 bits.
Notice that the elimination of a single bit
from the microinstruction would allow the addition of 8 more
instructions in the same space.
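The arithmetic behind that claim, as a quick check (the variable names are just for illustration):

```python
WORDS = 256
BITS_PER_WORD = 32

store_bits = WORDS * BITS_PER_WORD     # total size of the control store
words_at_31 = store_bits // 31         # whole 31-bit microinstructions that fit
extra = words_at_31 - WORDS

print(store_bits, words_at_31, extra)  # prints: 8192 264 8
```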
Microsequencer design
The microprogram counter (MPC) is an address to the
control store (0..255).
The microinstruction is the set of bits selected by the MPC and is
strobed into the microinstruction register (MIR) for execution.
The MPC is automatically incremented and made available as a
possible next instruction.
An alternative is the ADDR bits (0..255) of the MIR.
The particular next MPC is selected by adjusting the MUX,
controlled by the microsequencing logic. This logic simply takes the
ALU status bits (N and Z) along with the 2 COND bits of the
MIR to determine the appropriate setting of the MUX.
Tanenbaum's machine has
COND  MUX setting  Comments
00    0            don't branch
01    N            branch if N
10    Z            branch if Z
11    1            always branch
Quick work with a Karnaugh map identifies the appropriate logic as
C1Z+C1C0+C0N, where COND has bits C1C0.
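The expression can be checked exhaustively against the intended behavior (don't branch, branch if N, branch if Z, always branch). In this sketch a MUX setting of 1 is taken to select the ADDR field and 0 the incremented MPC; the function names are hypothetical.

```python
from itertools import product

def mux_logic(c1, c0, n, z):
    """MUX setting from the sum-of-products C1Z + C1C0 + C0N."""
    return (c1 & z) | (c1 & c0) | (c0 & n)

def mux_intended(c1, c0, n, z):
    """MUX setting read directly from the intended branch behavior."""
    return {(0, 0): 0, (0, 1): n, (1, 0): z, (1, 1): 1}[(c1, c0)]

def next_mpc(mpc, addr, cond, n, z):
    """Pick ADDR when the MUX setting is 1, else the incremented MPC."""
    c1, c0 = (cond >> 1) & 1, cond & 1
    return addr if mux_logic(c1, c0, n, z) else (mpc + 1) % 256

# All 16 combinations of C1, C0, N, Z agree.
agree = all(mux_logic(*bits) == mux_intended(*bits)
            for bits in product((0, 1), repeat=4))
```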
Timing.
We need to coordinate the flow of information through the datapath.
Tanenbaum uses a clock with four subcycles. See Figure 4-5 in Tanenbaum 1990.
One full cycle, during which a complete operation can be performed,
consists of 4 subcycles:
subcycle 1: Load MIR, let signals settle.
subcycle 2: Latch A and B buses, let signals
settle. (Note AMUX was set on load of MIR).
subcycle 3: Load MAR and let ALU, status bits, shifter and
C bus settle.
subcycle 4: Load MBR and scratchpad (if ENC) from
C bus.
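The four subcycles can be walked through in software. The sketch below is a simplified model under several assumptions: memory reads and writes (RD, WR) are left out, the shifter encoding is 0 = none, 1 = right, 2 = left, and the state is a plain dictionary. The names microinstruction and run_cycle are hypothetical.

```python
MASK = 0xFFFF

def microinstruction(**fields):
    """MIR contents as a dict; control fields not named default to 0."""
    mir = dict(A=0, B=0, C=0, AMUX=0, ALU=0, SH=0, MAR=0, MBR=0, ENC=0)
    mir.update(fields)
    return mir

def run_cycle(state, mir):
    """Advance the datapath by one full cycle; returns (N, Z)."""
    # Subcycle 1: the microinstruction is loaded into the MIR (here: `mir`).
    # Subcycle 2: latch the selected registers from the A and B buses.
    a_latch = state["regs"][mir["A"]]
    b_latch = state["regs"][mir["B"]]
    # Subcycle 3: load MAR from the B latch; ALU and shifter settle.
    if mir["MAR"]:
        state["mar"] = b_latch
    left = state["mbr"] if mir["AMUX"] else a_latch   # AMUX picks MBR or A
    alu_out = [(left + b_latch) & MASK,               # 00: A+B
               left & b_latch,                        # 01: A AND B
               ~left & MASK,                          # 10: NOT A
               left][mir["ALU"]]                      # 11: A
    n, z = alu_out >> 15, int(alu_out == 0)
    c_bus = [alu_out, alu_out >> 1, (alu_out << 1) & MASK][mir["SH"]]
    # Subcycle 4: load MBR and the scratchpad (if ENC) from the C bus.
    if mir["MBR"]:
        state["mbr"] = c_bus
    if mir["ENC"]:
        state["regs"][mir["C"]] = c_bus
    return n, z

state = {"regs": [0] * 16, "mar": 0, "mbr": 0}
state["regs"][1], state["regs"][2] = 5, 7
flags = run_cycle(state, microinstruction(A=1, B=2, C=3, ENC=1))  # r3 := r1 + r2
```

After this one cycle, register 3 holds 12 and both status bits are 0.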
The entire MIC1 microarchitecture is represented by the diagram in
Tanenbaum 1990's Figure 4-10.