| Morgan Kaufmann       | Computer Architecture: A Quantitative Approach, Fifth Edition |
|-----------------------|---------------------------------------------------------------|
| Computer Architecture | Chapter 2: Memory Hierarchy Design (DRAM and Virtual Memory)  |
|                       | Copyright © 2012, Elsevier Inc. All rights reserved.          |

# **Memory Technology**

- Performance metrics
  - Latency is the main concern of caches
  - Bandwidth is the main concern of multiprocessors and I/O
  - Access time
    - Time between read request and when desired word arrives
  - Cycle time
    - Minimum time between unrelated requests to memory
- DRAM used for main memory, SRAM used for cache


# **Memory Technology**

- SRAM
  - Requires low power to retain bit
  - Requires 6 transistors/bit
- DRAM
  - Must be re-written after being read
  - Must also be periodically refreshed
    - Every ~ 8 ms
    - All the bits in a row can be refreshed simultaneously
  - One transistor/bit
  - Address lines are multiplexed:
    - Upper half of address: row access strobe (RAS)
    - Lower half of address: column access strobe (CAS)
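The multiplexed addressing above can be sketched in a few lines. This is only an illustration: the function name `split_dram_address` and the assumption of an even number of address bits are not from the slides.

```python
def split_dram_address(addr, addr_bits):
    """Split a cell address into (row, column) halves.

    The upper half is sent first with RAS, the lower half second with CAS,
    so the chip needs only addr_bits / 2 address pins.
    Assumes addr_bits is even (an illustrative simplification).
    """
    if addr_bits % 2 != 0:
        raise ValueError("this sketch assumes an even number of address bits")
    half = addr_bits // 2
    row = addr >> half                  # upper half: latched on RAS
    col = addr & ((1 << half) - 1)      # lower half: latched on CAS
    return row, col


row, col = split_dram_address(0b10110110, 8)
```

Here an 8-bit address is split into the 4-bit row `0b1011` and the 4-bit column `0b0110`.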

# **Memory Technology**

- Amdahl:
  - Memory capacity should grow linearly with processor speed
  - Unfortunately, memory capacity and speed have not kept pace with processors
- Some optimizations:
  - Multiple accesses to same row
  - Synchronous DRAM
    - Added clock to DRAM interface
    - Burst mode with critical word first
  - Wider interfaces
  - Double data rate (DDR)
  - Multiple banks on each DRAM device

# **Memory Optimizations**

| Production year | Chip size | DRAM type | RAS, slowest DRAM (ns) | RAS, fastest DRAM (ns) | CAS (ns) | Cycle time (ns) |
|-----------------|-----------|-----------|------------------------|------------------------|----------|-----------------|
| 1980            | 64K bit   | DRAM      | 180                    | 150                    | 75       | 250             |
| 1983            | 256K bit  | DRAM      | 150                    | 120                    | 50       | 220             |
| 1986            | 1M bit    | DRAM      | 120                    | 100                    | 25       | 190             |
| 1989            | 4M bit    | DRAM      | 100                    | 80                     | 20       | 165             |
| 1992            | 16M bit   | DRAM      | 80                     | 60                     | 15       | 120             |
| 1998            | 128M bit  | SDRAM     | 70                     | 50                     | 10       | 100             |
| 2000            | 256M bit  | DDR1      | 65                     | 45                     | 7        | 90              |
| 2002            | 512M bit  | DDR1      | 60                     | 40                     | 5        | 80              |
| 2004            | 1G bit    | DDR2      | 55                     | 35                     | 5        | 70              |
| 2006            | 2G bit    | DDR2      | 50                     | 30                     | 2.5      | 60              |
| 2010            | 4G bit    | DDR3      | 36                     | 28                     | 1        | 37              |
| 2012            | 8G bit    | DDR3      | 30                     | 24                     | 0.5      | 31              |

Figure 2.13 Times of fast and slow DRAMs vary with each generation. (Cycle time is defined on page 95.) Performance improvement of row access time is about 5% per year. The improvement by a factor of 2 in column access in 1986 accompanied the switch from NMOS DRAMs to CMOS DRAMs. The introduction of various burst transfer modes in the mid-1990s and SDRAMs in the late 1990s has significantly complicated the calculation of access time for blocks of data; we discuss this later in this section when we talk about SDRAM access time and power. The DDR4 designs are due for introduction in mid- to late 2012. We discuss these various forms of DRAMs in the next few pages.
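
The "about 5% per year" figure in the Figure 2.13 caption can be checked against the first and last rows of the table. The sketch below assumes compound annual improvement; the helper name `annual_improvement` is illustrative.

```python
def annual_improvement(old_ns, new_ns, years):
    """Compound annual speedup that turns old_ns into new_ns over `years` years."""
    return (old_ns / new_ns) ** (1.0 / years) - 1.0


# Row access times (ns) from the 1980 and 2012 rows of Figure 2.13.
slowest = annual_improvement(180, 30, 2012 - 1980)
fastest = annual_improvement(150, 24, 2012 - 1980)

print(f"slowest DRAM: {slowest:.1%} per year")
print(f"fastest DRAM: {fastest:.1%} per year")
```

Both rates come out between 5% and 6% per year, which is consistent with the caption's rounded figure.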


### **Memory Optimizations**

| Standard | Clock rate (MHz) | M transfers per second | DRAM name | MB/sec/DIMM   | DIMM name |
|----------|------------------|------------------------|-----------|---------------|-----------|
| DDR      | 133              | 266                    | DDR266    | 2128          | PC2100    |
| DDR      | 150              | 300                    | DDR300    | 2400          | PC2400    |
| DDR      | 200              | 400                    | DDR400    | 3200          | PC3200    |
| DDR2     | 266              | 533                    | DDR2-533  | 4264          | PC4300    |
| DDR2     | 333              | 667                    | DDR2-667  | 5336          | PC5300    |
| DDR2     | 400              | 800                    | DDR2-800  | 6400          | PC6400    |
| DDR3     | 533              | 1066                   | DDR3-1066 | 8528          | PC8500    |
| DDR3     | 666              | 1333                   | DDR3-1333 | 10,664        | PC10700   |
| DDR3     | 800              | 1600                   | DDR3-1600 | 12,800        | PC12800   |
| DDR4     | 1066-1600        | 2133-3200              | DDR4-3200 | 17,056-25,600 | PC25600   |

Figure 2.14 Clock rates, bandwidth, and names of DDR DRAMs and DIMMs in 2010. Note the numerical relationship between the columns. The third column is twice the second, and the fourth uses the number from the third column in the name of the DRAM chip. The fifth column is eight times the third column, and a rounded version of this number is used in the name of the DIMM. Although not shown in this figure, DDRs also specify latency in clock cycles as four numbers, which are specified by the DDR standard. For example, DDR3-2000 CL 9 has latencies of 9-9-9-28. What does this mean? With a 1 ns clock (clock cycle is one-half the transfer rate), this indicates 9 ns for row to column address (RAS time), 9 ns for column access to data (CAS time), and a minimum read time of 28 ns. Closing the row takes 9 ns for precharge but happens only when the reads from that row are finished. In burst mode, transfers occur on every clock on both edges, when the first RAS and CAS times have elapsed. Furthermore, the precharge is not needed until the entire row is read. DDR4 will be produced in 2012 and is expected to reach clock rates of 1600 MHz in 2014, when DDR5 is expected to take over. The exercises explore these details further.
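
The column relationships described in the Figure 2.14 caption are simple arithmetic, sketched below. The function names and the 8-byte DIMM width parameter are illustrative assumptions; the factor of eight itself is the one stated in the caption. Some table entries (for example 1333 for a 666 MHz clock) are rounded nominal values, so the sketch matches them only approximately.

```python
def ddr_figures(clock_mhz, dimm_bytes=8):
    """Return (M transfers per second, MB/sec per DIMM) for a DDR clock rate.

    DDR transfers on both clock edges, so transfers = 2 * clock.
    dimm_bytes=8 is the factor of eight given in the caption.
    """
    transfers = 2 * clock_mhz
    bandwidth = transfers * dimm_bytes
    return transfers, bandwidth


def cycles_to_ns(cycles, clock_mhz):
    """Convert a latency given in clock cycles to nanoseconds."""
    return cycles * 1000.0 / clock_mhz


print(ddr_figures(133))          # DDR266 row of the table
print(ddr_figures(800))          # DDR3-1600 row of the table
print(cycles_to_ns(9, 1000))     # 9 cycles of a 1 ns (1000 MHz) clock
```

With a 1 ns clock, a latency of 9 cycles is 9 ns, which matches the precharge time quoted in the caption.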


### **Avoiding Memory Bank Conflicts**

- Suppose that we have 128 banks and want to store a 512 x 512 array.
- Because 512 is a multiple of 128, all the elements of a column (with row-major storage) map to the same bank, so accessing a column causes conflicts.
- Usually the number of banks is a power of 2. The mapping is:
  - Bank number = address MOD number of banks
  - Address within a bank = address / number of banks
- This is a trivial calculation if the number of banks is a power of 2: the low-order address bits select the bank and the remaining bits give the address within the bank.
- If the number of memory banks is a prime number, conflicts decrease, but division and MOD become very expensive.
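
The conflict in the 128-bank example can be reproduced with a short sketch. It assumes word addressing and row-major storage (both assumptions, not stated on the slide); under that layout it is the stride-512 walk down a column that stays in one bank, and with column-major storage the roles of rows and columns swap. All names are illustrative.

```python
N = 512            # the array is N x N words
NUM_BANKS = 128


def word_address(i, j, n=N):
    """Row-major word address of element (i, j); the layout is an assumption."""
    return i * n + j


def bank_of(addr, num_banks=NUM_BANKS):
    return addr % num_banks


def addr_in_bank(addr, num_banks=NUM_BANKS):
    return addr // num_banks


def banks_touched(addrs, num_banks=NUM_BANKS):
    """Number of distinct banks used by a sequence of addresses."""
    return len({bank_of(a, num_banks) for a in addrs})


row0 = [word_address(0, j) for j in range(N)]   # stride 1
col0 = [word_address(i, 0) for i in range(N)]   # stride 512

print(banks_touched(row0))        # spreads over all 128 banks
print(banks_touched(col0))        # every access hits the same bank
print(banks_touched(col0, 127))   # a prime bank count spreads the column out
```

With 127 banks the same column touches every bank, which is the motivation for a prime number of banks.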


### **Avoiding Memory Bank Conflicts**

- MOD can be calculated very efficiently if the prime number is 1 less than a power of 2.
- Division is still a problem.
- But we can change the mapping so that:
  - Address in a bank = address MOD number of words in a bank
- Since the number of words in a bank is usually a power of 2, this leads to a very efficient implementation.
- Consider the following example. The first case is the usual 4 banks, followed by 3 banks with sequential interleaving and 3 banks with modulo interleaving. Notice the conflict-free access to the rows and columns of a 4 by 4 matrix.
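
One way to see why MOD is cheap when the number of banks is 2^k - 1: since 2^k leaves remainder 1, the k-bit chunks of the address can simply be added together. The sketch below is one possible software rendering of that idea, not a description of any particular hardware; the function names are illustrative.

```python
def mod_mersenne(x, k):
    """Compute x mod (2**k - 1) for x >= 0 using only shifts, masks and adds."""
    m = (1 << k) - 1
    while x > m:
        # 2**k is congruent to 1 (mod m), so the high part folds onto the low part.
        x = (x & m) + (x >> k)
    return 0 if x == m else x


def modulo_interleave(addr, k, words_per_bank):
    """Map addr to (bank, address in bank) with 2**k - 1 banks.

    Assumes words_per_bank is a power of 2, so the second MOD is a bit mask.
    """
    bank = mod_mersenne(addr, k)
    offset = addr & (words_per_bank - 1)
    return bank, offset


print(modulo_interleave(16, 2, 8))   # 3 banks of 8 words
print(modulo_interleave(17, 2, 8))
```

The mapping is one-to-one only for addresses below banks x words per bank, and only because the two moduli share no common factor.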

### **Example**

Each entry is the memory address stored at that location. The first four columns use 4 banks; SEQ is sequential interleaving and MOD is modulo interleaving, both with 3 banks.

| Address in a bank | Bank 0 | Bank 1 | Bank 2 | Bank 3 | SEQ bank 0 | SEQ bank 1 | SEQ bank 2 | MOD bank 0 | MOD bank 1 | MOD bank 2 |
|-------------------|--------|--------|--------|--------|------------|------------|------------|------------|------------|------------|
| 0                 | 0      | 1      | 2      | 3      | 0          | 1          | 2          | 0          | 16         | 8          |
| 1                 | 4      | 5      | 6      | 7      | 3          | 4          | 5          | 9          | 1          | 17         |
| 2                 | 8      | 9      | 10     | 11     | 6          | 7          | 8          | 18         | 10         | 2          |
| 3                 | 12     | 13     | 14     | 15     | 9          | 10         | 11         | 3          | 19         | 11         |
| 4                 | 16     | 17     | 18     | 19     | 12         | 13         | 14         | 12         | 4          | 20         |
| 5                 | 20     | 21     | 22     | 23     | 15         | 16         | 17         | 21         | 13         | 5          |
| 6                 | 24     | 25     | 26     | 27     | 18         | 19         | 20         | 6          | 22         | 14         |
| 7                 | 28     | 29     | 30     | 31     | 21         | 22         | 23         | 15         | 7          | 23         |
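
The three layouts in the example table can be regenerated from their mapping rules. This is a sketch with illustrative names (`layout`, `place`); it also checks that no two addresses land in the same slot.

```python
def layout(num_banks, words_per_bank, place):
    """Fill a words_per_bank x num_banks grid using place(addr) -> (bank, offset)."""
    table = [[None] * num_banks for _ in range(words_per_bank)]
    for addr in range(num_banks * words_per_bank):
        bank, offset = place(addr)
        assert table[offset][bank] is None, "two addresses mapped to one slot"
        table[offset][bank] = addr
    return table


# The usual 4 banks: bank = addr MOD 4, address in bank = addr / 4.
four_banks = layout(4, 8, lambda a: (a % 4, a // 4))
# 3 banks, sequential interleaving: address in bank = addr / 3.
seq = layout(3, 8, lambda a: (a % 3, a // 3))
# 3 banks, modulo interleaving: address in bank = addr MOD 8.
mod = layout(3, 8, lambda a: (a % 3, a % 8))

for offset, row in enumerate(mod):
    print(offset, row)
```

The rows printed for `mod` correspond to the modulo-interleaving columns of the table, for example `[0, 16, 8]` at address 0 in each bank.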

# **Memory Optimizations**

- DDR:
  - DDR2
    - Lower power (2.5 V -> 1.8 V)
    - Higher clock rates (266 MHz, 333 MHz, 400 MHz)
  - DDR3
    - 1.5 V
    - 800 MHz
  - DDR4
    - 1-1.2 V
    - 1600 MHz
- GDDR5 is graphics memory based on DDR3


# **Memory Optimizations**

- Graphics memory:
  - Achieves 2-5x the bandwidth per DRAM of DDR3
    - Wider interfaces (32 vs. 16 bit)
    - Higher clock rate
      - Possible because the chips are soldered to the board instead of mounted in socketed DIMM modules
- Reducing power in SDRAMs:
  - Lower voltage
  - Low power mode (ignores clock, continues to refresh)

### **Flash Memory**

- Type of EEPROM
- Must be erased (in blocks) before being overwritten
- Nonvolatile
- Limited number of write cycles
- Cheaper than SDRAM, more expensive than disk
- Slower than SRAM, faster than disk


# **Memory Dependability**

- Memory is susceptible to cosmic rays
- Soft errors: dynamic errors
  - Detected and fixed by error correcting codes (ECC)
- Hard errors: permanent errors
  - Use spare rows to replace defective rows
- Chipkill: a RAID-like error recovery technique
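
To illustrate how an error correcting code can both detect and fix a soft error, here is a minimal sketch of a Hamming(7,4) code, which corrects any single flipped bit. The choice of this code is purely illustrative; memory ECC typically uses wider codes over whole words, and the function names are assumptions.

```python
def hamming74_encode(data):
    """Encode 4 data bits as a 7-bit codeword laid out p1 p2 d1 p3 d2 d3 d4."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4      # covers codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4      # covers codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4      # covers codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_correct(codeword):
    """Return (data bits, error position); position 0 means no error was seen."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3   # syndrome is the 1-based index of the bad bit
    if position:
        c[position - 1] ^= 1          # flip it back
    return [c[2], c[4], c[5], c[6]], position


word = [1, 0, 1, 1]
stored = hamming74_encode(word)
stored[4] ^= 1                        # simulate a soft error in one cell
recovered, where = hamming74_correct(stored)
```

After the simulated upset, `recovered` equals the original `word` and `where` identifies the flipped position.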

### **Virtual Memory**

- Protection via virtual memory
  - Keeps processes in their own memory space
- Role of architecture:
  - Provide user mode and supervisor mode
  - Protect certain aspects of CPU state
  - Provide mechanisms for switching between user mode and supervisor mode
  - Provide mechanisms to limit memory accesses
  - Provide TLB to translate addresses


### **Virtual Memory**

- Virtual memory references are generated by the compiler
- Physical memory is shared between many processes.
- Physical memory may be smaller than virtual memory.
- Need some mechanism to translate virtual addresses to physical addresses.
- Also need a protection scheme that allows processes to reference only memory that belongs to them.


# **Virtual Memory**

- A page table is used to translate virtual addresses to physical addresses
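
A minimal sketch of page-table translation, assuming a single-level table held in a dictionary and 4 KiB pages. The page size, the `PageFault` exception, and all names are illustrative assumptions; the slides do not fix a page-table format.

```python
PAGE_SIZE = 4096   # assumed page size in bytes


class PageFault(Exception):
    """Raised when a virtual page has no mapping (hypothetical name)."""


def translate(vaddr, page_table, page_size=PAGE_SIZE):
    """Translate a virtual address using page_table: virtual page -> physical frame."""
    vpn, offset = divmod(vaddr, page_size)     # split into page number and offset
    if vpn not in page_table:
        raise PageFault(f"no mapping for virtual page {vpn}")
    return page_table[vpn] * page_size + offset


page_table = {0: 3, 1: 7}
paddr = translate(0x1234, page_table)          # virtual page 1, offset 0x234
print(hex(paddr))
```

Only the page number is translated; the offset within the page passes through unchanged.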



#### **TLB**

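- Without a TLB, every memory reference takes 2 memory accesses: one to read the page table entry and one to access the data.
- A translation lookaside buffer (TLB) is used to improve performance
- The TLB is a small cache that stores part of the page table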
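
The idea of the TLB as a small cache over the page table can be sketched as follows. The fully associative organization, the LRU replacement policy, and the class and attribute names are illustrative assumptions rather than anything specified on the slide.

```python
from collections import OrderedDict


class TLB:
    """Tiny fully associative TLB with LRU replacement (illustrative policy)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()   # virtual page -> frame, least recent first
        self.hits = 0
        self.misses = 0

    def lookup(self, vpn, page_table):
        if vpn in self.entries:
            self.hits += 1
            self.entries.move_to_end(vpn)      # mark as most recently used
            return self.entries[vpn]
        self.misses += 1
        frame = page_table[vpn]                # the extra memory access
        self.entries[vpn] = frame
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)   # evict the least recently used entry
        return frame


page_table = {0: 5, 1: 6, 2: 7}
tlb = TLB(capacity=2)
frames = [tlb.lookup(vpn, page_table) for vpn in [0, 0, 1, 2, 0]]
print(frames, tlb.hits, tlb.misses)
```

Only the misses pay for the page-table access, so locality in the reference stream determines how much the TLB helps.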


### **Virtual Machines**

- Supports isolation and security
- Sharing a computer among many unrelated users
- Enabled by raw speed of processors, making the overhead more acceptable
- Allows different ISAs and operating systems to be presented to user programs
  - "System Virtual Machines"
  - SVM software is called "virtual machine monitor" or "hypervisor"
  - Individual virtual machines that run under the monitor are called "guest VMs"


# **Impact of VMs on Virtual Memory**

- Each guest OS maintains its own set of page tables
  - VMM adds a level of memory between physical and virtual memory called "real memory"
  - VMM maintains shadow page table that maps guest virtual addresses to physical addresses
    - Requires VMM to detect guest's changes to its own page table
    - Occurs naturally if accessing the page table pointer is a privileged operation
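
Conceptually, the shadow page table is the composition of two mappings: the guest's page table (virtual to real) and the VMM's own map (real to physical). The sketch below uses plain dictionaries and hypothetical names; rebuilding the whole table after a guest update is a simplification chosen for brevity.

```python
def build_shadow_table(guest_page_table, real_to_physical):
    """Compose guest virtual -> real with real -> physical.

    Guest pages whose real page has no physical backing are left unmapped.
    """
    shadow = {}
    for vpn, real_page in guest_page_table.items():
        if real_page in real_to_physical:
            shadow[vpn] = real_to_physical[real_page]
    return shadow


guest_pt = {0: 2, 1: 3}        # maintained by the guest OS
vmm_map = {2: 10, 3: 11}       # maintained by the VMM
shadow = build_shadow_table(guest_pt, vmm_map)

# When the guest edits its page table, the VMM must notice and refresh the shadow.
guest_pt[1] = 2
shadow_after = build_shadow_table(guest_pt, vmm_map)
print(shadow, shadow_after)
```

The hardware then translates guest virtual addresses directly through the shadow table, without walking both mappings on each access.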
