
M31 on 12 FFC Multi-Port Register Array


A multi-port register file, also known as a multi-port register array, is an array of registers inside a processor. It is a common type of memory in processors such as central processing units, embedded processors, and neural network processors, where it temporarily stores instructions, data, and addresses. A register array has very limited storage capacity, but it is fast to read and write. Processor architectures therefore use it to hold the intermediate results of arithmetic units, relying on fast data access to keep those units supplied and improve computational efficiency.

There are two ways to implement a multi-port register file. The first is to use logic synthesis to build the array from latches or flip-flops in a standard cell library. The second is to use customized multi-port static random access memory (SRAM) cells to build a customized register array. These multi-port storage cells have dedicated read and write ports and can support multiple simultaneous accesses. The second approach offers better area and performance, while the first allows the register array to be designed and implemented quickly.

To customize the implementation of multi-port register arrays, M31 has proposed a time-division multiplexing technology that optimizes the control unit of the register array and improves the write and read bandwidth of the memory cells. This allows a multi-port register array to be designed and implemented with the standard SRAM cells provided by the foundry (e.g. single-port or dual-port memory cells). The technology shortens design time and delivery, and it also gives the multi-port register array well-optimized area and performance, helping customers complete more competitive processor designs.
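The general idea of time-division multiplexing can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not M31's actual circuit: the class names `SinglePortSram` and `TdmMultiPort` are hypothetical, the single-port array is assumed to run several internal access slots per external clock cycle, and reads are assumed to be served before writes within a cycle.

```python
class SinglePortSram:
    """Behavioral single-port memory: one read OR one write per internal slot."""

    def __init__(self, depth):
        self.cells = [0] * depth
        self.accesses = 0  # number of internal access slots consumed so far

    def access(self, addr, write=False, data=None):
        self.accesses += 1
        if write:
            self.cells[addr] = data
            return None
        return self.cells[addr]


class TdmMultiPort:
    """Presents several logical ports by time-slicing one single-port array.

    Assumption for illustration: the control unit sequences all port
    requests of one external cycle through the single physical port.
    """

    def __init__(self, depth, read_ports=2, write_ports=2):
        self.sram = SinglePortSram(depth)
        self.read_ports = read_ports
        self.write_ports = write_ports

    def cycle(self, reads, writes):
        """Run one external clock cycle.

        reads:  list of addresses, one per active read port
        writes: list of (addr, data) pairs, one per active write port
        Reads are served first, so they observe the pre-cycle contents.
        """
        if len(reads) > self.read_ports or len(writes) > self.write_ports:
            raise ValueError("more requests than logical ports")
        results = [self.sram.access(addr) for addr in reads]
        for addr, data in writes:
            self.sram.access(addr, write=True, data=data)
        return results


rf = TdmMultiPort(depth=8)
rf.cycle(reads=[], writes=[(1, 10), (2, 20)])           # two writes in one cycle
first = rf.cycle(reads=[1, 2], writes=[(1, 99), (3, 30)])  # 2 reads + 2 writes
second = rf.cycle(reads=[1, 3], writes=[])
```

In this model each external cycle consumes as many internal slots as there are active ports, which is why the approach depends on the underlying cell being fast enough to complete several accesses within one processor cycle.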

M31 provides comprehensive design solutions on TSMC’s 12nm advanced process. To meet market demand for ultra-high-speed operation, M31 offers a 4-port SRAM with a 2R2W (two-read, two-write) operating mode, in which each port operates independently without interfering with the others. M31’s memory team also applies masking techniques to read and write operations to keep data transfer stable during operation while meeting CPU/NPU requirements for parallel data reads and writes. The customized memory block is built on the SRAM bit cell architecture provided by the foundry, which significantly reduces area and power consumption compared with traditional latch or flip-flop implementations. In addition, M31 has developed a boosted amplifier architecture to overcome the limit of the bit cell’s own read speed, further supporting ultra-high operating frequencies above the megahertz level within the processor.
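As a rough illustration of what 2R2W operation with write masking means at the behavioral level, the following sketch models a register file that accepts up to two reads and two masked writes per cycle. The class `RegisterFile2R2W`, the per-bit mask format, and the rule that the second write port wins on overlapping bits of the same address are all assumptions made for this example; the article does not describe how M31's masking is actually implemented.

```python
class RegisterFile2R2W:
    """Behavioral model: two read ports and two write ports per cycle."""

    def __init__(self, depth, width=32):
        self.full_mask = (1 << width) - 1
        self.regs = [0] * depth

    def cycle(self, reads=(), writes=()):
        """Run one clock cycle.

        reads:  up to two addresses
        writes: up to two (addr, data, mask) tuples; only bits set in
                mask are updated, all other bits keep their old value
        """
        if len(reads) > 2 or len(writes) > 2:
            raise ValueError("2R2W allows at most two reads and two writes")
        # Reads sample the contents before this cycle's writes (assumption).
        out = [self.regs[addr] for addr in reads]
        for addr, data, mask in writes:
            mask &= self.full_mask
            # Keep unmasked bits, replace masked bits with the new data.
            self.regs[addr] = (self.regs[addr] & ~mask) | (data & mask)
        return out


rf = RegisterFile2R2W(depth=4, width=8)
rf.cycle(writes=[(0, 0xAB, 0xFF), (1, 0xCD, 0xFF)])        # two full writes
before = rf.cycle(reads=[0, 1], writes=[(0, 0x12, 0x0F)])   # masked write
after = rf.cycle(reads=[0])
```

Here the masked write updates only the low four bits of register 0, so the upper four bits written in the first cycle are preserved.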

Diagram: 2R2W Register File in Multi-core Computing Structure