Computer Science 220
Assembly Language & Computer Architecture

Fall 2010, Siena College

Lecture 17: Pipelined Data Path and Control; Pipeline Hazards
Date: Thursday, December 2, 2010

Agenda

Lecture Assignment 17

Due at the start of class, Tuesday, December 7.

Please submit answers to these questions either as hard copy (typeset or handwritten is fine) or by email to jteresco AT siena.edu by the start of class. We will discuss these questions at the start of class, so late submissions will not be accepted. The textbook problems are reproduced here for those with a different edition of the text.

  1. P&H Exercise 4.13, parts 1b, 2b, and 3b only
    In this exercise, we examine how data dependences affect execution in the basic five-stage pipeline described in Section 4.5. Problems in this exercise refer to the following sequence of instructions:
    lw $5, -16($5)
    sw $5, -16($5)
    add $5, $5, $5
    

    4.13.1. Indicate dependences and their type.
    4.13.2. Assume there is no forwarding in this pipelined processor. Indicate hazards and add nop instructions to eliminate them.
    4.13.3. Assume there is full forwarding. Indicate hazards and add nop instructions to eliminate them.

  2. P&H Exercise 4.15, parts 1b and 3b only
    In this exercise, we examine how the ISA affects pipeline design. Problems in this exercise refer to the following new instruction:
    swi Rd, Rs(Rt)
    

    which has the effect: Mem[Rs+Rt] = Rd
    4.15.1. What must be changed in the pipelined datapath to add this instruction to the MIPS ISA?
    4.15.3. Does support for this instruction introduce any new hazards? Are stalls due to existing hazards made worse?

  3. P&H Exercise 4.17, parts 4a, 5a, 6a only
    Each pipeline stage in Figure 4.33 has some latency. Additionally, pipelining introduces registers between stages (Figure 4.35), and each of these adds further latency. The remaining problems in this exercise assume the following latencies for the logic within each pipeline stage and for each pipeline register between two stages:
    a. IF: 100ps, ID: 120ps, EX: 90ps, MEM: 130ps, WB: 60ps, Pipeline register: 10ps
    4.17.4. Assuming there are no stalls, what is the speed-up achieved by pipelining a single-cycle datapath?
    4.17.5. We can convert all load/store instructions into register-based (no offset) and put the memory access in parallel with the ALU. What is the clock cycle time if this is done in the single-cycle and in the pipelined datapath? Assume that the latency of the new EX/MEM stage is equal to the longer of their latencies.
    4.17.6. The change in 4.17.5 requires many existing lw/sw instructions to be converted into two-instruction sequences. If this is needed for 50% of these instructions, what is the overall speed-up achieved by changing from the five-stage pipeline to the four-stage pipeline where EX and MEM are done in parallel?
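
For the dependence questions in Exercise 4.13, it can help to check a hand analysis mechanically. The following is a minimal Python sketch, not part of the textbook or the course materials. The helper names `parse` and `dependences` are hypothetical, the parser only understands `add`, `sub`, `lw`, and `sw`, and the example sequence is made up rather than the assigned one. It reports every pair of instructions that share a register, including pairs that an intervening write would shadow, and it says nothing about whether a dependence becomes a hazard in a particular pipeline.

```python
import re

def parse(instr):
    """Return (mnemonic, dest_reg or None, [source_regs]) for add/sub/lw/sw."""
    op, rest = instr.split(None, 1)
    regs = re.findall(r"\$\w+", rest)
    if op == "sw":
        # sw writes no register; it reads the value register and the base register
        return op, None, regs
    # add/sub/lw write the first register listed and read the remaining ones
    return op, regs[0], regs[1:]

def dependences(program):
    """List (kind, register, earlier_index, later_index) for every instruction pair."""
    parsed = [parse(i) for i in program]
    found = []
    for j, (_, dest_j, srcs_j) in enumerate(parsed):
        for i in range(j):
            _, dest_i, srcs_i = parsed[i]
            if dest_i is not None and dest_i in srcs_j:
                found.append(("RAW", dest_i, i, j))  # later reads what earlier wrote
            if dest_j is not None and dest_j in srcs_i:
                found.append(("WAR", dest_j, i, j))  # later writes what earlier read
            if dest_i is not None and dest_i == dest_j:
                found.append(("WAW", dest_i, i, j))  # both write the same register
    return found

# Made-up example sequence (an assumption for illustration only)
example = ["add $1, $2, $3", "lw $4, 8($1)", "sub $2, $4, $1"]

for kind, reg, i, j in dependences(example):
    print(f"{kind} on {reg}: instruction {i} -> instruction {j}")
```

In the basic five-stage pipeline only the RAW entries can turn into data hazards, since instructions read and write registers in program order; the WAR and WAW entries are listed only because the exercise asks for the type of each dependence.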
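
The timing questions in Exercise 4.17 rest on two relations: a single-cycle datapath needs a clock period equal to the sum of the stage latencies, while a pipelined datapath needs a period equal to the slowest stage plus one pipeline register. The Python sketch below works through those relations under stated assumptions. All function names are hypothetical, the latencies in `example_stages` are made up (in arbitrary time units) rather than taken from the exercise, and `mem_fraction` stands in for the share of instructions that are loads or stores, which this page does not give.

```python
def single_cycle_time(stages):
    """Clock period of a single-cycle datapath: every stage in one cycle."""
    return sum(stages.values())

def pipelined_cycle_time(stages, reg_latency):
    """Clock period of a pipelined datapath: slowest stage plus one register."""
    return max(stages.values()) + reg_latency

def speedup(stages, reg_latency):
    """Speed-up from pipelining, assuming no stalls (one instruction per cycle)."""
    return single_cycle_time(stages) / pipelined_cycle_time(stages, reg_latency)

def merge_ex_mem(stages):
    """Run EX and MEM in parallel; the merged stage takes the longer latency."""
    merged = {k: v for k, v in stages.items() if k not in ("EX", "MEM")}
    merged["EX/MEM"] = max(stages["EX"], stages["MEM"])
    return merged

def relative_speedup(old_cycle, new_cycle, mem_fraction, expand_fraction):
    """Speed-up when some loads/stores each become two instructions.

    mem_fraction: share of instructions that are lw/sw (an assumed input).
    expand_fraction: share of those that must be split in two.
    """
    instruction_growth = 1 + mem_fraction * expand_fraction
    return old_cycle / (new_cycle * instruction_growth)

# Made-up latencies, not the values from the exercise
example_stages = {"IF": 200, "ID": 150, "EX": 250, "MEM": 300, "WB": 100}
example_reg = 20

print(single_cycle_time(example_stages))
print(pipelined_cycle_time(example_stages, example_reg))
print(speedup(example_stages, example_reg))

four_stage = merge_ex_mem(example_stages)
print(pipelined_cycle_time(four_stage, example_reg))
```

With these made-up numbers the merged EX/MEM stage is no faster than the original MEM stage, so the pipelined clock period does not improve and any extra instructions make the four-stage design slower overall. Whether that also holds for the exercise's latencies depends on which stage is the slowest there.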