# COEN6741

Computer Architecture and Design

Chapter 2

#### **Instruction Set Principles**

(Dr. Sofiène Tahar)

#### Outline

- Introduction
- ISA Classes
- Addressing Modes
- Operand Type and Size
- Instruction Operations
- Instruction Formats
- Compiler Considerations
- MIPS R3000 Case Study

COEN6741 Chap 2.1

9/23/2003

#### **Review:** Organization

- All computers consist of five components
  - Processor: (1) datapath and (2) control
  - (3) Memory
  - (4) Input devices and (5) Output devices
- Not all memory is created equal
  - Cache: fast (expensive) memory is placed closer to the processor
  - Main memory: less expensive memory, so we can have more of it
- Input and output (I/O) devices have the messiest organization
  - Wide range of speed: graphics vs. keyboard
  - Wide range of requirements: speed, standard, cost ...
  - Least amount of research (so far)

# **Review:** Computer System Components



- All have interfaces & organizations

#### Instruction Set Architecture: What must be specified?



#### **Basic ISA Classes**



#### Comparison:

Bytes per instruction? Number of Instructions? Cycles per instruction?

#### **Basic ISA Classes**

| Class                    | Addresses   | Example      | Meaning                  |
|--------------------------|-------------|--------------|--------------------------|
| Accumulator              | 1 address   | add A        | acc ← acc + mem[A]       |
|                          | 1+x address | addx A       | acc ← acc + mem[A + x]   |
| Stack                    | 0 address   | add          | tos ← tos + next         |
| General Purpose Register | 2 address   | add A B      | EA(A) ← EA(A) + EA(B)    |
|                          | 3 address   | add A B C    | EA(A) ← EA(B) + EA(C)    |
| Load/Store               | 3 address   | add Ra Rb Rc | Ra ← Rb + Rc             |
|                          |             | load Ra Rb   | Ra ← mem[Rb]             |
|                          |             | store Ra Rb  | mem[Rb] ← Ra             |

#### **Comparing Number of Instructions**

Code sequence for C = A + B for four classes of instruction sets:

| Stack  | Accumulator | Register<br>(register-memory) | Register<br>(load-store) |
|--------|-------------|-------------------------------|--------------------------|
| Push A | Load A      | Load R1,A                     | Load R1,A                |
| Push B | Add B       | Add R1,B                      | Load R2,B                |
| Add    | Store C     | Store C, R1                   | Add R3,R1,R2             |
| Pop C  |             |                               | Store C,R3               |
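The four code sequences can be checked with a toy interpreter per ISA class. This is only a sketch under simple assumptions: memory is a Python dictionary keyed by variable name, and the function names (`run_stack`, `run_accumulator`, `run_register_memory`, `run_load_store`, `compare_classes`) are illustrative and not part of the slides.

```python
# Toy interpreters for C = A + B in the four ISA classes.
# Each returns the number of instructions executed; mem is updated in place.

def run_stack(mem):
    """Stack machine: Push A, Push B, Add, Pop C."""
    program = [("push", "A"), ("push", "B"), ("add",), ("pop", "C")]
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(mem[instr[1]])
        elif op == "add":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        else:  # pop
            mem[instr[1]] = stack.pop()
    return len(program)


def run_accumulator(mem):
    """Accumulator machine: Load A, Add B, Store C."""
    program = [("load", "A"), ("add", "B"), ("store", "C")]
    acc = 0
    for op, addr in program:
        if op == "load":
            acc = mem[addr]
        elif op == "add":
            acc += mem[addr]
        else:  # store
            mem[addr] = acc
    return len(program)


def run_register_memory(mem):
    """Register-memory machine: Load R1,A; Add R1,B; Store C,R1."""
    program = [("load", "R1", "A"), ("add", "R1", "B"), ("store", "C", "R1")]
    regs = {}
    for op, x, y in program:
        if op == "load":
            regs[x] = mem[y]
        elif op == "add":
            regs[x] += mem[y]  # one operand comes straight from memory
        else:  # store
            mem[x] = regs[y]
    return len(program)


def run_load_store(mem):
    """Load-store machine: Load R1,A; Load R2,B; Add R3,R1,R2; Store C,R3."""
    program = [("load", "R1", "A"), ("load", "R2", "B"),
               ("add", "R3", "R1", "R2"), ("store", "C", "R3")]
    regs = {}
    for instr in program:
        op = instr[0]
        if op == "load":
            regs[instr[1]] = mem[instr[2]]
        elif op == "add":
            regs[instr[1]] = regs[instr[2]] + regs[instr[3]]  # registers only
        else:  # store
            mem[instr[1]] = regs[instr[2]]
    return len(program)


def compare_classes(a, b):
    """Run all four sequences; return {class: (instruction count, C)}."""
    runners = {
        "stack": run_stack,
        "accumulator": run_accumulator,
        "register-memory": run_register_memory,
        "load-store": run_load_store,
    }
    results = {}
    for name, run in runners.items():
        mem = {"A": a, "B": b, "C": None}
        count = run(mem)
        results[name] = (count, mem["C"])
    return results
```

Instruction count alone does not decide performance: bytes per instruction and cycles per instruction differ between the classes as well.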

#### General Purpose Registers Dominate

- 1975-1995: all machines use general purpose registers
- Advantages of registers
  - registers are faster than memory
  - registers are easier for a compiler to use
    - e.g., in an expression built from (A\*B), (C\*B), and (A\*D), the multiplies can be done in any order, unlike on a stack machine
  - registers can hold variables
    - memory traffic is reduced, so the program is sped up (since registers are faster than memory)
    - code density improves (since a register is named with fewer bits than a memory location)

#### Classification of GPR Architectures

| # Memory<br>Addresses | Max. #<br>Operands | Type of<br>Architecture | Examples                                         |
|-----------------------|--------------------|-------------------------|--------------------------------------------------|
| 0                     | 3                  | Register-Register       | Alpha, ARM, MIPS, PowerPC, SPARC                 |
| 1                     | 2                  | Register-Memory         | IBM 360/370, Intel 80x86, MC68000, TI TMS320C54x |
| 2                     | 2                  | Memory-Memory           | VAX                                              |
| 3                     | 3                  | Memory-Memory           | VAX                                              |

#### Instruction Classes (Summary)

- Expect a new instruction set architecture to use general purpose registers
- Pipelining (performance) => expect it to use the load/store variant of a GPR ISA

# Memory Addressing

- Since 1980 almost every machine addresses memory at the level of 8 bits (a byte)
- Two questions for design of ISA:
  - 1. A 32-bit word can be read either as four byte loads from sequential byte addresses or as one word load from a single byte address; how do byte addresses map onto words?
  - 2. Can a word be placed on any byte boundary?

#### Addressing Objects: Endianness and Alignment

- Big Endian: address of most significant byte = word address (xx00 = big end of word)
  - IBM 360/370, Motorola 68k, MIPS, Sparc, HP PA
- Little Endian: address of least significant byte = word address (xx00 = little end of word)
  - Intel 80x86, DEC VAX, DEC Alpha (Windows NT)
- Alignment: require that objects fall on an address that is a multiple of their size

(Figure: byte numbering within a word from msb to lsb, with big endian bytes numbered 0 1 2 3; examples of aligned and not aligned objects.)
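Both ideas can be illustrated in a few lines. The sketch below uses Python's built-in `int.to_bytes`; the helper names `word_bytes` and `is_aligned` and the sample word `0x0A0B0C0D` are assumptions made for illustration only.

```python
def word_bytes(value, byteorder):
    """Bytes of a 32-bit word listed in increasing address order.

    byteorder is "big" or "little", as accepted by int.to_bytes.
    """
    return list(value.to_bytes(4, byteorder))


def is_aligned(address, size):
    """An object is aligned if its address is a multiple of its size."""
    return address % size == 0


WORD = 0x0A0B0C0D

# Big endian: the most significant byte (0x0A) sits at the word address.
big = word_bytes(WORD, "big")

# Little endian: the least significant byte (0x0D) sits at the word address.
little = word_bytes(WORD, "little")
```

With these helpers, a 4-byte word at address 8 is aligned, while the same word at address 6 is not, although a 2-byte halfword at address 6 is.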

#### Addressing Modes

| Addressing mode    | Example            | Meaning                           |
|--------------------|--------------------|-----------------------------------|
| Register           | Add R4,R3          | R4 ← R4 + R3                      |
| Immediate          | Add R4,#3          | R4 ← R4 + 3                       |
| Displacement       | Add R4,100(R1)     | R4 ← R4 + Mem[100 + R1]           |
| Register indirect  | Add R4,(R1)        | R4 ← R4 + Mem[R1]                 |
| Indexed / Base     | Add R3,(R1+R2)     | R3 ← R3 + Mem[R1 + R2]            |
| Direct or absolute | Add R1,(1001)      | R1 ← R1 + Mem[1001]               |
| Memory indirect    | Add R1,@(R3)       | R1 ← R1 + Mem[Mem[R3]]            |
| Auto-increment     | Add R1,(R2)+       | R1 ← R1 + Mem[R2]; R2 ← R2 + d    |
| Auto-decrement     | Add R1,-(R2)       | R2 ← R2 - d; R1 ← R1 + Mem[R2]    |
| Scaled             | Add R1,100(R2)[R3] | R1 ← R1 + Mem[100 + R2 + R3 \* d] |
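The "Meaning" column can be read as a recipe for fetching the source operand. A minimal sketch follows, assuming registers and memory are plain dictionaries and that `d` defaults to 4 bytes; the function `fetch_operand` and its parameter names are hypothetical, chosen only for this illustration.

```python
def fetch_operand(mode, regs, mem, base=None, index=None, const=0, d=4):
    """Return the source operand for one addressing mode.

    regs maps register names to values, mem maps addresses to values,
    const is the immediate, displacement, or absolute address, and d is
    the operand size used by auto-increment/decrement and scaled modes.
    """
    if mode == "register":
        return regs[base]
    if mode == "immediate":
        return const
    if mode == "displacement":
        return mem[const + regs[base]]
    if mode == "register_indirect":
        return mem[regs[base]]
    if mode == "indexed":
        return mem[regs[base] + regs[index]]
    if mode == "direct":
        return mem[const]
    if mode == "memory_indirect":
        return mem[mem[regs[base]]]
    if mode == "auto_increment":
        value = mem[regs[base]]   # use the address, then bump the register
        regs[base] += d
        return value
    if mode == "auto_decrement":
        regs[base] -= d           # bump the register, then use the address
        return mem[regs[base]]
    if mode == "scaled":
        return mem[const + regs[base] + regs[index] * d]
    raise ValueError(f"unknown addressing mode: {mode}")
```

Note that only the auto-increment and auto-decrement modes modify a register as a side effect of computing the address.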

#### Addressing Mode Usage (Summary)

3 programs measured on machine with all address modes (VAX)

- Displacement: 42% avg, 32% to 55%
- Immediate: 33% avg, 17% to 43%
- Register deferred (indirect): 13% avg, 3% to 24%
- Scaled: 7% avg, 0% to 16%
- Memory indirect: 3% avg, 1% to 6%
- Misc: 2% avg, 0% to 3%

75% displacement & immediate

88% displacement, immediate & register indirect
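The 75% and 88% figures are running totals of the per-mode averages. A short sketch of that arithmetic, using the averages quoted on this slide; the names `usage` and `cumulative_coverage` are illustrative.

```python
# Average usage per addressing mode, in percent, as quoted on the slide.
usage = {
    "displacement": 42,
    "immediate": 33,
    "register deferred": 13,
    "scaled": 7,
    "memory indirect": 3,
    "misc": 2,
}


def cumulative_coverage(usage):
    """Running total of usage, most frequent mode first."""
    total = 0
    coverage = []
    ranked = sorted(usage.items(), key=lambda item: item[1], reverse=True)
    for mode, percent in ranked:
        total += percent
        coverage.append((mode, total))
    return coverage
```

Supporting only the three most frequent modes already covers 88% of the measured uses, which is the argument for a small set of addressing modes.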

## Addressing Mode Usage



#### Displacement Address Size?



- Avg. of 5 SPECint92 programs vs. avg. of 5 SPECfp92 programs
- X-axis is in powers of 2: 4 => addresses > $2^3$ (8) and < $2^4$ (16)
- 1% of addresses need more than 16 bits
- 12 to 16 bits of displacement needed
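Whether a given displacement fits in the instruction depends on the field width. The sketch below assumes a signed two's-complement displacement field, which the slide does not state explicitly; `bits_needed` and `fits` are hypothetical helper names.

```python
def bits_needed(displacement):
    """Smallest two's-complement field width that holds the displacement."""
    bits = 1
    # An n-bit signed field covers -2^(n-1) .. 2^(n-1) - 1.
    while not (-(1 << (bits - 1)) <= displacement <= (1 << (bits - 1)) - 1):
        bits += 1
    return bits


def fits(displacement, field_bits):
    """True if the displacement can be encoded in field_bits bits."""
    return bits_needed(displacement) <= field_bits
```

Under this assumption a 16-bit field reaches from -32768 to 32767; anything beyond that needs an extra instruction to form the address.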



#### Operand Size Usage



#### Support these data sizes and types

8-bit, 16-bit, 32-bit integers and 32-bit and 64-bit IEEE 754 floating-point numbers

### **Typical Operations**

| Category              | Operations |
|-----------------------|------------|
| Data Movement         | Load (from memory)<br>Store (to memory)<br>memory-to-memory move<br>register-to-register move<br>input (from I/O device)<br>output (to I/O device)<br>push, pop (to/from stack) |
| Arithmetic            | integer (binary + decimal) or FP<br>Add, Subtract, Multiply, Divide |
| Shift                 | shift left/right, rotate left/right |
| Logical               | not, and, or, set, clear |
| Control (Jump/Branch) | unconditional, conditional |
| Subroutine Linkage    | call, return |
| Interrupt             | trap, return |
| Synchronization       | test & set (atomic r-m-w) |
| String                | search, translate |
| Graphics (MMX)        | parallel subword ops (4 16-bit add) |

# Top 10 80×86 Instructions

| Rank | Instruction            | Integer average (percent executed) |
|------|------------------------|------------------------------------|
| 1    | load                   | 22%                                |
| 2    | conditional branch     | 20%                                |
| 3    | compare                | 16%                                |
| 4    | store                  | 12%                                |
| 5    | add                    | 8%                                 |
| 6    | and                    | 6%                                 |
| 7    | sub                    | 5%                                 |
| 8    | move register-register | 4%                                 |
| 9    | call                   | 1%                                 |
| 10   | return                 | 1%                                 |
|      | Total                  | 96%                                |

#### Simple instructions dominate instruction frequency

#### Instruction Operation (Summary)

Support these simple instructions, since they will dominate the number of instructions executed:

- load
- store
- add, subtract, move register-register, and, shift
- compare equal, compare not equal
- branch, jump, call, return

#### Instruction for Control Flow

- Conditional branches
- Jumps
- Procedure calls
- Procedure returns



# Methods of Testing Condition



## Frequency of Types of Compares



Setting the condition codes (CC) as a side effect can reduce the number of instructions:

| CC set as a side effect | Explicit compare |
|-------------------------|------------------|
| `X: ...`                | `X: ...`         |
| `SUB r0, #1, r0`        | `SUB r0, #1, r0` |
|                         | `CMP r0, #0`     |
| `BRP X`                 | `BRP X`          |

But it also has disadvantages:

- not all instructions set the condition codes, and which do and which do not is often confusing (e.g., the shift instruction sets the carry bit)
- there is a dependency between the instruction that sets the CC and the one that tests it: to overlap their execution, they may need to be separated by an instruction that does not change the CC





# Generic Examples of Instruction Format Widths

- a) Variable (e.g., VAX, Intel 80x86): `Operation and no. of operands | Address specifier 1 | Address field 1 | ... | Address specifier n | Address field n`
- b) Fixed (e.g., Alpha, ARM, MIPS, PowerPC, SPARC, SuperH): `Operation | Address field 1 | Address field 2 | Address field 3`
- c) Hybrid:
  - `Operation | Address specifier | Address field`
  - `Operation | Address specifier 1 | Address specifier 2 | Address field`
  - `Operation | Address specifier | Address field 1 | Address field 2`

© 2003 Elsevier Science (USA). All rights reserved.

#### **Instruction Formats (Summary)**

- If code size is most important, use variable length instructions
- If performance is most important, use fixed length instructions
- For a discussion of the architectures of different machines, see Appendix D (Intel 80x86), E (VAX), F (IBM 360/370)

#### **Instruction Formats**

(Figure: variable, fixed, and hybrid instruction formats.)

- Addressing modes
  - each operand requires address specifier => variable format
- Code size => variable length instructions
- Performance => fixed length instructions
  - simple decoding, predictable operations
- With load/store ISA, only one memory address and few addressing modes
  - => simple format, address mode given by opcode

#### **Compiler Considerations**



# **Compiler Considerations**



#### **Compiler Considerations**

#### Ease of compilation

- Orthogonality: no special registers, few special cases, all operand modes available with any data type or instruction type
- Completeness: support for a wide range of operations and target applications
- Regularity: no overloading for the meanings of instruction fields
- Streamlined: resource needs easily determined
- Register assignment is critical too
  - Easier if lots of registers



#### Compiler Considerations (Summary)

- Provide at least 16 general purpose registers plus separate floating-point registers
- Be sure all addressing modes apply to all data transfer instructions
- Aim for a minimalist instruction set

#### Case Study: MIPS

- 32-bit fixed format instructions (3 formats: R, I, J)
- 32 64-bit GPR (R0 contains zero) and 32 FP registers (and HI, LO)
  - partitioned by software convention
- 3-address, reg-reg arithmetic instructions
- Single address mode for load/store: base+displacement
  - no indirection or scaling
- 16-bit immediate plus LUI
- Simple branch conditions
  - compare against zero, or compare two registers for equality
  - no integer condition codes
- Delayed branch
  - execute the instruction after the branch (or jump) even if the branch is taken (the compiler can fill a delayed branch with useful work about 50% of the time)
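The "16-bit immediate plus LUI" point means a full 32-bit constant takes two instructions. The sketch below models the usual pairing of LUI with ORI, which is an assumption about the idiom and is not spelled out on the slide; the function names are illustrative.

```python
def split_constant(value):
    """Split a 32-bit constant into (upper 16 bits, lower 16 bits)."""
    value &= 0xFFFFFFFF
    return value >> 16, value & 0xFFFF


def lui(imm16):
    """LUI: place the 16-bit immediate in the upper half, zero the lower half."""
    return (imm16 & 0xFFFF) << 16


def ori(reg_value, imm16):
    """ORI: bitwise OR with a zero-extended 16-bit immediate."""
    return reg_value | (imm16 & 0xFFFF)


def load_constant(value):
    """Build a 32-bit constant the way a LUI/ORI pair would."""
    upper, lower = split_constant(value)
    return ori(lui(upper), lower)
```

Because ORI zero-extends its immediate, the lower half can be merged in without disturbing the upper half that LUI set.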

#### Case Study: MIPS

- Use general purpose registers with a load-store architecture: <u>YES</u>
- Provide at least 16 general purpose registers plus separate floating-point registers: <u>31 GPR & 32 FPR</u>
- Support basic addressing modes: displacement (with an address offset size of 12 to 16 bits), immediate (size 8 to 16 bits), and register deferred: <u>YES: 16 bits for immediate, displacement (disp=0 => register deferred)</u>
- All addressing modes apply to all data transfer instructions : <u>YES</u>

# Case Study: MIPS

- Use fixed instruction encoding if interested in performance and use variable instruction encoding if interested in code size : <u>Fixed</u>
- Support these data sizes and types: 8-bit, 16-bit, 32-bit integers and 32-bit and 64-bit IEEE 754 floating point numbers: <u>YES</u>
- Support these simple instructions, since they will dominate the number of instructions executed: <u>load</u>, <u>store</u>, <u>add</u>, <u>subtract</u>, <u>move register-register</u>, <u>and</u>, <u>shift</u>, <u>compare equal</u>, <u>compare not equal</u>, <u>branch</u> (with a PC-relative address at least 8-bits long), jump, <u>call</u>, and <u>return</u>: <u>YES</u>
- Aim for a minimalist instruction set: <u>YES</u>

#### **MIPS: Load/Store Architecture**

- 3-address GPR
- Register-to-register arithmetic
- Load and store with simple addressing modes (reg + immediate)
- Simple conditionals
  - compare ops + branch z
  - compare & branch
  - condition code + branch on condition
- Simple fixed-format encoding
  - R: op r r r
  - I: op r r immed
  - J: op offset
- Substantial increase in instructions
  - Decrease in data BW (due to many registers)
  - Even more significant decrease in CPI (pipelining)
  - Cycle time, real estate, design time, design complexity

# **MIPS:** Instruction Set

Programmable storage:

- $2^{32}$ x <u>bytes</u>
- 31 x 32-bit GPRs (R0=0)
- 32 x 32-bit FP regs (paired DP)
- HI, LO, PC

Data types? Format? Addressing modes?

32-bit instructions on word boundary

(Figure: register file r0, r1, ..., r31, plus PC, lo, hi.)

#### Arithmetic logical

Add, AddU, Sub, SubU, And, Or, Xor, Nor, SLT, SLTU, AddI, AddIU, SLTI, SLTIU, AndI, OrI, XorI, LUI, SLL, SRL, SRA, SLLV, SRLV, SRAV

#### Memory Access

LB, LBU, LH, LHU, LW, LWL, LWR, SB, SH, SW, SWL, SWR

#### Control

J, JAL, JR, JALR

BEQ, BNE, BLEZ, BGTZ, BLTZ, BGEZ, BLTZAL, BGEZAL

#### **MIPS:** Addressing Modes & Formats

- Simple addressing modes
- All instructions 32 bits wide

Addressing modes:

- Register (direct): `op rs rt rd`, operand in a register
- Immediate: `op rs rt immed`, operand is the immediate
- Base+index: `op rs rt immed`, operand at Memory[register + immed]
- PC-relative: `op rs rt immed`, operand at Memory[PC + immed]

Formats (bit positions):

- Register-Register: Op (31-26), Rs1 (25-21), Rs2 (20-16), Rd (15-11), Opx (10-0)
- Register-Immediate: Op (31-26), Rs1 (25-21), Rd (20-16), immediate (15-0)
- Branch: Op (31-26), Rs1 (25-21), Rs2/Opx (20-16), immediate (15-0)
- Jump / Call: Op (31-26), target (25-0)

# **MIPS:** Instruction Formats



#### MIPS ISA (Summary)

- Instruction Categories
  - Load/Store
  - Computational
  - Jump and Branch
  - Floating Point
    - coprocessor
  - Memory Management
  - Special

- Registers: R0 - R31, PC, HI, LO
- 3 Instruction Formats: all 32 bits wide

| Format | Fields                        |
|--------|-------------------------------|
| R      | OP, rs, rt, rd, sa, funct     |
| I      | OP, rs, rt, immediate         |
| J      | OP, jump target               |
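The three formats can be made concrete by packing fields into a 32-bit word. The sketch below assumes the standard MIPS field widths (6-bit OP, 5-bit rs/rt/rd/sa, 6-bit funct, 16-bit immediate, 26-bit jump target); the function names are illustrative, and the opcode and funct values used in the usage note are the standard MIPS encodings for ADD, ADDI, and J.

```python
def encode_r(op, rs, rt, rd, sa, funct):
    """R format: OP | rs | rt | rd | sa | funct."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (sa << 6) | funct


def encode_i(op, rs, rt, immediate):
    """I format: OP | rs | rt | 16-bit immediate (two's complement)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (immediate & 0xFFFF)


def encode_j(op, target):
    """J format: OP | 26-bit jump target."""
    return (op << 26) | (target & 0x03FFFFFF)


def decode_r(word):
    """Split a 32-bit word back into its R-format fields."""
    return {
        "op": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "rd": (word >> 11) & 0x1F,
        "sa": (word >> 6) & 0x1F,
        "funct": word & 0x3F,
    }
```

For example, `encode_r(0, 1, 2, 3, 0, 0x20)` encodes `add $3, $1, $2`, `encode_i(8, 1, 2, -1)` encodes `addi $2, $1, -1`, and `encode_j(2, 0x100)` encodes a jump. Since every field sits at a fixed position, decoding needs only shifts and masks, which is the practical benefit of a fixed format.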

# **MIPS:** Instruction Formats

#### Cray-1: The Original RISC

- Register-Register: Op (bits 15-9), Rd (8-6), Rs1 (5-3), R2 (2-0)
- Load, Store and Branch: Op (15-9), Rd (8-6), Rs1 (5-3), Immediate (2-0, continued in a second 16-bit parcel, bits 15-0)

#### VAX-11: The Canonical CISC

- Variable format, 2 and 3 address instructions

(Figure: variable-length instruction laid out over bytes 0, 1, ..., n, ..., m.)

#### Chapter 2: Summary #1

- ISA: GPR with Load/Store
- Addressing modes:
  - Displacement (12 to 16 bits)
  - Immediate (8 to 16 bits)
  - Register deferred
- Instruction Set Operations:
  - Load, store
  - Arithmetic, logic, shift, compare
  - Branch (PC-relative 8 bit), jump, call, return
- Type & Size of Operands:
  - Integer 8, 16 and 32 bit
  - Floating-point (IEEE 754) 32 and 64 bit

#### Chapter 2: Summary #2

- Instruction Set Encoding:
  - Fixed encoding
  - Hybrid encoding
- Size of Register File:
  - At least 16 GP registers
  - Separate integer (32 bit) and FP (64 bit) register files
- CISC vs. RISC:
  - Pros and cons
  - Intel: typical CISC
  - Alpha: typical RISC
- MIPS R3000 Architecture
