
Overview

DDR2 Memory System


Created: 2006-06-09


  1. Introduction
  2. Physical Memory
  3. DDR2 Controller
  4. User Memory Interface
  5. Frequently Asked Questions
  6. Revision History

Introduction


The purpose of this document is to give an overview of the DDR2 memory system supported by the BEE2. In addition to a brief overview of the physical memory support on each BEE2 module, this document also describes the custom memory interface logic created for the BEE2 at the Berkeley Wireless Research Center.

Overview


Access to the raw DDR2 memory has been implemented through a set of interfaces which build upon one another. The following figure gives a high-level view of this layered approach.

At the lowest level is the actual DDR2 memory controller. Each DIMM has its own independent DDR2 controller, each of which is heavily pipelined and supports simple bank management. On top of the low-level DDR2 controller is the user memory interface. The user memory interface is a simple, common interface that user logic can use to access memory. The interface is implemented as a series of asynchronous command FIFOs which provide command buffering and decouple the user clock domain from the 200MHz DDR2 controller clock domain. Finally, on top of the user memory interface, any arbitrary user logic can be implemented.

Multi-port Access


To provide multi-port access to each DIMM, we provide a simple arbiter (the multi-port switch) in front of the user memory interface of each DIMM. The number of ports on the switch is fully configurable, and it supports different modes of arbitration. The basic mode is priority round-robin, which balances the needs of a priority-based arbiter against the requirement for starvation-free arbitration. In addition to the basic mode, the arbiter supports bursting: a requester keeps its grant for as long as it holds its request line asserted, up to a configurable window of cycles. This benefits applications that need sustained back-to-back accesses to maintain sufficient performance. Finally, the arbitration logic is self-contained and has a consistent interface, so users can easily implement their own arbitration scheme to suit their particular needs.
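The arbitration behavior described above can be illustrated with a small behavioral model. This is only a sketch, not the BEE2 arbiter itself: the class name `RoundRobinArbiter`, the `burst_window` parameter, and the exact way the window is counted (extra cycles beyond the first grant) are assumptions made for illustration, and "priority round-robin" is interpreted here as a rotating priority pointer.

```python
class RoundRobinArbiter:
    """Behavioral sketch of a rotating-priority arbiter with a burst window.

    Hypothetical model: names and window semantics are assumptions,
    not taken from the BEE2 cores.
    """

    def __init__(self, num_ports, burst_window=0):
        self.num_ports = num_ports
        self.burst_window = burst_window  # extra cycles an owner may keep the grant
        self.next_port = 0                # port with the highest priority this cycle
        self.owner = None                 # port granted in the previous cycle
        self.held = 0                     # extra cycles the owner has held so far

    def step(self, requests):
        """Return the port granted this cycle, or None if no port requests."""
        # Bursting: the owner keeps the grant while it keeps requesting,
        # until the configured window is used up.
        if (self.owner is not None and requests[self.owner]
                and self.held < self.burst_window):
            self.held += 1
            return self.owner
        # Otherwise scan all ports, starting at the rotating priority pointer.
        for offset in range(self.num_ports):
            port = (self.next_port + offset) % self.num_ports
            if requests[port]:
                self.owner = port
                self.held = 0
                # The port after the winner gets top priority next time,
                # which is what makes the scheme starvation free.
                self.next_port = (port + 1) % self.num_ports
                return port
        self.owner = None
        self.held = 0
        return None


# Two ports that request continuously.
plain = RoundRobinArbiter(2)
print([plain.step([True, True]) for _ in range(4)])    # [0, 1, 0, 1]
burst = RoundRobinArbiter(2, burst_window=2)
print([burst.step([True, True]) for _ in range(7)])    # [0, 0, 0, 1, 1, 1, 0]
```

With the window set to zero the model degenerates to plain round-robin; a larger window trades fairness granularity for longer uninterrupted runs per port.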

The multi-port switch provides each port the same user memory interface as is provided by the asynchronous command FIFO block. This means that any logic designed to work directly with the user memory interface will work without modification with the multi-port switch. An example of a system which uses the multi-port switch can be seen below.

This example is similar to the setup of the reference Linux base system provided with the BSP. In this example the multi-port switch provides two ports. One is connected to a PLB attachment that allows the host processor to access the physical memory. The second port is attached to a frame buffer device (note that the OPB attachment for the frame buffer is not used for data transfer; it only provides access to control registers). In this example, bursting is turned on to meet the timing requirements of the frame buffer's DVI output.


Physical Memory


Each BEE2 module has five Virtex-II Pro 70 FPGAs, each of which is connected to four fully independent DDR2 DIMMs. At present the maximum memory capacity of each DIMM is 1GB (although higher-density DIMMs could bring this to 2GB or above), giving a total memory capacity of 20GB per BEE2 module. The FPGA is wired to accept a 72-bit data path, allowing the use of ECC storage bits as well.
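The capacity figure follows directly from the counts given above. The short calculation below is a sketch that restates that arithmetic; the constant and function names are illustrative, and the split of the 72-bit path into 64 data bits plus 8 ECC bits is the usual ECC DIMM layout, consistent with the 128-bit and 144-bit transfer sizes quoted later in this document.

```python
FPGAS_PER_MODULE = 5   # Virtex-II Pro 70 FPGAs per BEE2 module
DIMMS_PER_FPGA = 4     # independent DDR2 DIMMs per FPGA
GB_PER_DIMM = 1        # current maximum DIMM capacity

DATA_PATH_BITS = 72    # wired data path width per DIMM
DATA_BITS = 64         # bits carrying user data (standard ECC DIMM layout)
ECC_BITS = DATA_PATH_BITS - DATA_BITS


def module_capacity_gb(gb_per_dimm=GB_PER_DIMM):
    """Total memory per BEE2 module for a given DIMM size."""
    return FPGAS_PER_MODULE * DIMMS_PER_FPGA * gb_per_dimm


print(module_capacity_gb())    # 20
print(module_capacity_gb(2))   # 40, with 2GB DIMMs
print(ECC_BITS)                # 8
```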

For more information on the details of the physical connections to memory, please see the BEE2 module documentation, which contains links to schematics and data sheets. Additionally, the Bee2Setup page has information about part numbers for DIMMs that have been verified to work with the BEE2.


DDR2 Controller


Overview


The heart of the BEE2 memory system is the DDR2 controller. The controller is responsible for all of the low-level DRAM management and data transfer tasks. The controller was originally based on the data path generated by the Xilinx MIG007 tool and was then heavily modified at BWRC. The tasks performed by the controller include:

As of now, the controller allows access to ECC storage bits, but does not actually implement ECC. For more detailed information on the controller implementation, please see the core documentation directly for the ddr2_controller_v2_00_a core.

Interface


Although the DDR2 controller has independent data paths for read and write data, the controller does not allow simultaneous issuance of both read and write commands. However, the controller is highly pipelined and can issue a command every other cycle. The interface provided by the controller is fairly simple, although it runs at 200MHz and therefore can introduce timing problems if interacted with directly. The preferred way to interact with this interface is through the asynchronous command FIFO block (described below), but the lowest latency option is to use this interface directly. The following diagram shows the signals that comprise the interface.

Please note that although there are separate user_read and user_write signals, they may not be asserted at the same time, and therefore only a single read or write can be issued at a time.

Functional Description


Command issue is gated by the user_ready signal: when this signal is high, a new command can be issued in that cycle. Because of the single-cycle turnaround on this signal, it often shows up on the critical path for this interface. A transaction happens in two phases: first the address is accepted, and then the data is fetched (for a write) or returned (for a read).

Phase 1: Address

When user_ready is high, the user can assert either user_read or user_write (but not both) along with a valid 32-bit address. Note that although a 32-bit address is presented, the lower 4 bits are unused because the controller transfers data aligned on 128-bit boundaries (144 bits if you count the extra ECC storage bits). The transaction is accepted on the rising clock edge that sees user_ready asserted together with user_read or user_write. The user_ready signal will always go low for at least one cycle after an address is accepted. An additional signal, user_half_burst, is also sampled at issue time. If it is high, the controller will accept or generate only one 144-bit data value; if it is low, the controller will accept or generate two 144-bit data values.
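The address-phase rules can be summarized in a few helper functions. This is a sketch for illustration only: the function names are hypothetical, and it assumes the 32-bit address is a byte address, so that 128-bit alignment corresponds to clearing the lower 4 bits.

```python
ALIGN_BITS = 4  # lower 4 address bits are unused (128 bits = 16 bytes)


def aligned_address(addr):
    """Address as the controller effectively sees it: 32 bits, low 4 bits cleared."""
    return addr & 0xFFFFFFFF & ~((1 << ALIGN_BITS) - 1)


def data_beats(user_half_burst):
    """Number of 144-bit data values transferred for one command."""
    return 1 if user_half_burst else 2


def command_accepted(user_ready, user_read, user_write):
    """True if a command is accepted on this rising clock edge."""
    if user_read and user_write:
        # The interface forbids asserting both in the same cycle.
        raise ValueError("user_read and user_write must not be asserted together")
    return bool(user_ready and (user_read or user_write))


print(hex(aligned_address(0x1234567F)))        # 0x12345670
print(data_beats(True), data_beats(False))     # 1 2
print(command_accepted(True, True, False))     # True
print(command_accepted(False, False, True))    # False
```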

Phase 2: Data

Timing


The following timing diagrams describe the behavior of the DDR2 controller interface.

Write Transactions

If the user_half_burst signal is asserted when issuing the command, then the user_get_data signal will only be asserted for one cycle and the controller only expects one 144-bit data value.

Read Transactions

If the user_half_burst signal is asserted when issuing the command, then the user_data_valid signal will only be asserted for one cycle and the controller only sends one 144-bit data value.
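Putting the figures from this section together gives a rough upper bound on the data rate of one controller. The calculation below is a back-of-the-envelope sketch, not a measured result: it assumes a command really can be issued every other 200MHz cycle indefinitely, counts only the 128 data bits of each 144-bit value, and ignores refresh, bank management, and read/write turnaround. The names are illustrative.

```python
CLOCK_HZ = 200_000_000      # DDR2 controller clock
CYCLES_PER_COMMAND = 2      # one command can be issued every other cycle
DATA_BITS_PER_VALUE = 128   # user data in each 144-bit value (ECC bits excluded)


def peak_bytes_per_second(user_half_burst=False):
    """Idealized peak data rate for a single DDR2 controller."""
    values_per_command = 1 if user_half_burst else 2
    commands_per_second = CLOCK_HZ // CYCLES_PER_COMMAND
    return commands_per_second * values_per_command * DATA_BITS_PER_VALUE // 8


print(peak_bytes_per_second())       # 3200000000 (full burst)
print(peak_bytes_per_second(True))   # 1600000000 (half burst)
```

Under these assumptions, half-burst commands can reach at most half of the full-burst rate, since each command still occupies an issue slot.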


User Memory Interface


Overview


Although interacting with the DDR2 controller interface directly is the highest-performance, lowest-latency way to access memory, most applications will have logic operating in a different clock domain than the 200MHz controller. To address this issue, we have provided an easier-to-use, clock-decoupled interface called the user memory interface. This interface is implemented by the async_ddr2_v2_00_a core, which contains block RAM based asynchronous FIFOs to buffer commands and provide clock domain crossing.

The interface has several different modes of operation which are set by two parameters:

Finally, the asynchronous command FIFO core also gives the user the option to use a command tag FIFO that maintains a 32-bit tag corresponding to each read transaction. This makes certain applications easier to build, given the split-phase operation of commands.

Interface


The following diagram shows all the signals that comprise the user memory interface.

The interface consists of roughly three groups of signals. The command (Cmd) signals are involved with issuing a read or write command. The write (Wr) signals carry data and byte enables for writes, and the read (Rd) signals carry data and tag information from reads.

Functional Description


Just like the DDR2 controller, the asynchronous command FIFO block allows only a single read or write command per cycle, and no reordering of commands is done within the block. This is an important point for consistency and coherence in multi-core systems, as it means that this core acts as a synchronization point for all commands. It is possible to augment this core to perform reordering and optimization, but at this time it is safe to assume that commands leave the core in the same order in which they entered it.

Phase 1: Command Issue

To issue a command, the user asserts Mem_Cmd_Valid and presents a valid address on Mem_Cmd_Address. The core asserts Mem_Cmd_Ack in the cycle in which the command is accepted (this may be the first cycle in which Mem_Cmd_Valid is asserted).

Phase 2: Read data

When the read data has returned from the controller, the core will assert Mem_Rd_Valid, and the Mem_Rd_Dout and Mem_Rd_Tag buses will be valid. The user can acknowledge the data at any time by asserting Mem_Rd_Ack for a single cycle.
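The split-phase, in-order behavior of the two phases, including the read tag, can be sketched with a small software model. This is an illustrative approximation only: the class `UserMemoryPortModel`, its method names, the `depth` parameter, and the way a write and a tag are supplied with a command are assumptions, since the full signal list is given only in the interface diagram. Clock domains and timing are not modeled.

```python
from collections import deque


class UserMemoryPortModel:
    """Hypothetical, untimed model of the command and read-data FIFOs."""

    def __init__(self, depth=16):
        self.depth = depth         # assumed command FIFO depth
        self.commands = deque()    # pending commands, oldest first
        self.read_data = deque()   # returned (data, tag) pairs, oldest first
        self.memory = {}           # stand-in for the DIMM contents

    def issue(self, address, write=False, data=None, tag=0):
        """Mem_Cmd_Valid with an address; returns True when Mem_Cmd_Ack would be seen."""
        if len(self.commands) >= self.depth:
            return False           # FIFO full: no ack this cycle, retry later
        self.commands.append((address, write, data, tag))
        return True

    def service_one(self):
        """Controller side: execute the oldest command, strictly in order."""
        if not self.commands:
            return
        address, write, data, tag = self.commands.popleft()
        if write:
            self.memory[address] = data
        else:
            self.read_data.append((self.memory.get(address), tag))

    def rd_valid(self):
        """Mem_Rd_Valid: read data (and its tag) is waiting."""
        return bool(self.read_data)

    def rd_ack(self):
        """Mem_Rd_Ack: consume the oldest (Mem_Rd_Dout, Mem_Rd_Tag) pair."""
        return self.read_data.popleft()


port = UserMemoryPortModel()
port.issue(0x10, write=True, data=0xAB)
port.issue(0x10, tag=7)            # read issued after the write sees its data
for _ in range(2):
    port.service_one()
print(port.rd_valid())             # True
print(port.rd_ack())               # (171, 7)
```

Because commands are never reordered, the tag returned with each read always corresponds to the oldest outstanding read, which is what lets user logic match responses to requests.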

Special Case: Narrow data mode and full burst mode

The only case where the behavior differs from the description above is when the core has been configured with C_WIDE_DATA=0 and C_HALF_BURST=0. This configuration uses the full bandwidth of the DDR2 controller without using the internal block RAM muxes. In this case the user is required to issue two commands for each write and to acknowledge two read data values for each read.
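The rule for this special case can be written as a one-line check. The function below is a sketch with a hypothetical name; it encodes only what is stated above, namely that the narrow-data, full-burst configuration needs two transfers per transaction and every other configuration needs one.

```python
def transfers_per_transaction(c_wide_data, c_half_burst):
    """Commands per write, or read acknowledges per read, for a given configuration."""
    if c_wide_data == 0 and c_half_burst == 0:
        return 2   # narrow data, full burst: the special case
    return 1       # all other configurations


for wide in (0, 1):
    for half in (0, 1):
        print(f"C_WIDE_DATA={wide} C_HALF_BURST={half}: "
              f"{transfers_per_transaction(wide, half)}")
```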


Frequently Asked Questions


None so far.


Revision History


Click the Info link below to see the revision history.

Bee2Memory (last edited 2006-06-26 16:08:43 by alschult)