Xinyaohui, a cutting-edge IP company, breaks through DDR PHY technology bottlenecks on multiple fronts
In recent years, the rapid development of cloud computing, 5G, the Internet of Things, artificial intelligence and other industries has greatly increased the demand for memory. As a key module of memory technology, DDR PHY is also seeing rapidly growing market demand. Written from the perspective of Xinyaohui, a cutting-edge IP company, this article introduces DDR PHY and the technological breakthroughs Xinyaohui has made in DDR PHY to better serve chip design companies.
What is DDR PHY
DDR PHY is the bridge between the DRAM and the memory controller. It converts data sent by the memory controller into signals that conform to the DDR protocol and sends them to the DRAM; in the other direction, it converts data sent by the DRAM into signals that conform to the DFI protocol and sends them to the memory controller. The DDR PHY and the memory controller are collectively referred to as DDR IP, and together they ensure data transfer between the SoC and the DRAM.
Strong market demand for DDR IP
As an important interface IP, DDR IP is in strong demand. According to a forecast by IPnest, the global interface IP market will maintain a compound annual growth rate of 16% from 2015 to 2024. Over the next few years, among the five major categories of interface IP (USB, PCIe, DDR, D2D & Ethernet, MIPI), DDR IP will remain among the top three by market share.
At present, international vendors hold a relatively high share of the DDR IP market, while domestic IP companies account for only a small proportion. The main reason is that DDR PHY has a high technical threshold, and breakthroughs in this type of PHY are not easy to achieve.
First, DDR PHY is better described as a piece of system engineering than as a single chip technology. DDR transfers data as parallel, multi-bit, single-ended bursts, which places high demands on power integrity (PI) and signal integrity (SI). Second, DDR is arguably the interface that depends most heavily on training. Whether the various training steps reach their best results directly affects how reliably the DDR works. PHY developers therefore need to understand both physical-layer design and training-algorithm design; only then can reliable products be developed, and this implicitly raises the design threshold. Finally, achieving high-speed single-ended signal transmission is a major test of DDR IO design.
Overcoming the DDR PHY technology bottleneck on multiple fronts
As a high-tech company focusing on semiconductor IP research, development and services, Xinyaohui Technology has identified the needs of chip companies and the market opportunity. Through reliable SI and PI analysis, optimized training algorithm design, high-performance IO design and a series of other technological innovations, it has broken through the technical bottleneck of DDR PHY.
Key technical point 1: Reliable SI and PI analysis guidance
DDR data transmission is characterized by multi-bit parallel transfer and single-ended data bursts. At present, an SoC can integrate a DDR interface of up to 72 bits (DDR4 with ECC). Routing a multi-bit parallel bus on the package and PCB is very complicated. Many traces must meet length-matching requirements, and at the same time the crosstalk between them must be minimized, so producing a qualified package and PCB design is a big challenge. In addition, in burst-mode transmission, SSO (Simultaneous Switching Output) noise can seriously degrade DDR performance. Stable DDR operation therefore requires reliable SI and PI analysis.
In the early stage of chip development, settling the PAD planning and package planning of the chip is very important for optimizing the SI and PI performance of the DDR later in the design. Xinyaohui carries out SI and PI analysis in the early stage of system-on-chip design and during IO preparation, helping customers plan ahead and safeguard the mass-production performance of the integrated DDR PHY.
In addition, the Xinyaohui team has developed a dedicated bit-stream analysis technique. With it, the team can efficiently check at the design stage whether the package and PCB design meets the DDR eye-diagram requirements, quickly locate defects, and guide customers in optimizing and improving the design.
Key technical point 2: High reliability training design
Stable operation of a DDR system depends on various kinds of training. At startup, a series of training steps is required, such as initial CA Training, Write Leveling, Read Leveling and Write Eye Training. For DDR4, LPDDR4 and later protocols, two-dimensional training that includes VREF is also required. A purely hardware-based approach cannot provide complex training patterns. For example, the JEDEC DDR4 protocol stipulates that the DRAM can only provide simple patterns such as 01010101. This is not enough for high-speed DDR training, because such patterns contain a single frequency and cannot expose the intersymbol interference (ISI) caused by data channel attenuation. In addition, different patterns produce different reflections at the termination. Therefore, if only the simple patterns specified by JEDEC are used to train the DDR, especially at higher rates, an optimal training result cannot be obtained.
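The two-dimensional training mentioned above can be pictured as a sweep over delay and VREF settings, followed by choosing the point with the most margin. The sketch below is a minimal illustration under stated assumptions: `passes(delay, vref)` is a hypothetical pass/fail probe, and the grid sizes, the toy eye shape and the "centre of the widest passing window" rule are illustrative choices, not Xinyaohui's algorithm.

```python
def train_2d(passes, n_delay, n_vref):
    """Return (delay, vref, width): centre of the widest passing delay window.

    passes(delay, vref) -> bool is a hypothetical probe that reports whether
    a test pattern was transferred correctly at that setting.
    """
    best = None  # (width, delay_centre, vref)
    for vref in range(n_vref):
        run_start = None
        # Scan one extra step so a run that reaches the last delay is closed.
        for delay in range(n_delay + 1):
            ok = delay < n_delay and passes(delay, vref)
            if ok and run_start is None:
                run_start = delay
            elif not ok and run_start is not None:
                width = delay - run_start
                centre = run_start + (width - 1) // 2
                if best is None or width > best[0]:
                    best = (width, centre, vref)
                run_start = None
    if best is None:
        raise RuntimeError("no passing setting found")
    width, centre, vref = best
    return centre, vref, width


def toy_eye(delay, vref):
    # Made-up diamond-shaped passing region centred at delay 8, vref 5.
    return abs(delay - 8) + 2 * abs(vref - 5) <= 6


result = train_2d(toy_eye, n_delay=16, n_vref=10)
print(result)  # (8, 5, 13)
```

In a real PHY the probe would run a write/read test at each setting, so the quality of the result depends directly on how demanding the test pattern is.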
Xinyaohui’s DDR PHY adopts a firmware-based training method that can use different patterns, such as PRBS patterns and specially designed frequency-sweep patterns. Such patterns reflect the characteristics of the data channel more comprehensively, because they contain high-, mid- and low-frequency content as well as the intersymbol interference caused by long runs of 0s and 1s, and this leads to better training results.
After the initial training is completed, the temperature and voltage inside the chip change with the operating state and the ambient temperature. These changes cause the trained settings to drift away from their ideal values, reducing the read and write margins of the DDR and, in severe cases, causing read and write data errors. Xinyaohui has developed a technology that dynamically detects temperature and voltage changes inside the chip. By compensating the various training results in real time, it keeps sufficient read and write margin and ensures stable DDR operation.
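To make the contrast with a 0101 pattern concrete, the following sketch generates one period of a PRBS7 sequence and measures its longest run of identical bits. PRBS7 and the helper names are chosen here for illustration only; the article does not say which PRBS order or sweep patterns the firmware actually uses.

```python
def prbs7(seed=0x7F, nbits=127):
    """Generate nbits of a PRBS7 sequence (polynomial x^7 + x^6 + 1)."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be non-zero")
    out = []
    for _ in range(nbits):
        newbit = ((state >> 6) ^ (state >> 5)) & 1  # XOR of the two tap bits
        state = ((state << 1) | newbit) & 0x7F
        out.append(newbit)
    return out


def longest_run(bits):
    """Length of the longest run of identical consecutive bits."""
    best = run = 1
    for prev, cur in zip(bits, bits[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best


bits = prbs7()
toggle = [0, 1] * 64          # the simple 0101... pattern
print(longest_run(bits))      # 7
print(longest_run(toggle))    # 1
```

The toggle pattern never holds a level for more than one bit, whereas the PRBS period mixes short and long runs, which is why it exercises both the high-frequency and low-frequency behaviour of the channel.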
Key technical point 3: High-performance DDR IO design
Crosstalk between signals and reflections caused by trace impedance mismatch seriously affect data communication. To ensure reliable DDR reads and writes, Xinyaohui uses FFE (Feed-Forward Equalization) and DFE (Decision Feedback Equalization) in its DDR IO design.
FFE front-end pre-equalization
FFE front-end pre-equalization is a technique used on the DDR TX side. Because the data channel attenuates the signal, suppressing its high-frequency content more than its low-frequency content, the eye height and eye width seen at the RX end are relatively small. The idea of FFE is to reduce the energy of the low-frequency components so that the high-frequency and low-frequency parts of the signal are balanced after passing through the channel. If the signal makes a 0->1 or 1->0 transition, it is driven at full strength (Full Strength); if the signal is a run of consecutive 1s or 0s, it is driven at the equalized strength (EQ Strength).
Simulated eye diagrams at the RX side at a data rate of 6400 Mbps, with FFE off and with FFE on, show that the eye quality with FFE enabled is significantly better than with FFE disabled.
Xinyaohui adopts a programmable front-end pre-equalization scheme. Different equalization effects can be obtained by setting different parameters, to meet the needs of various application scenarios.
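The full-strength / EQ-strength rule described above can be sketched in a few lines. This is a behavioural illustration only: the function name, the normalized amplitudes and the `eq_strength` value of 0.6 are assumptions, not parameters of Xinyaohui's IO.

```python
def ffe_drive(bits, eq_strength=0.6):
    """Map bits to normalized drive levels using a two-level FFE rule.

    A bit that differs from the previous bit is driven at full strength (1.0);
    a repeated bit is driven at the reduced eq_strength. The first bit is
    treated as a transition.
    """
    levels = []
    prev = None
    for b in bits:
        strength = 1.0 if b != prev else eq_strength
        levels.append(strength if b else -strength)
        prev = b
    return levels


levels = ffe_drive([0, 1, 1, 1, 0, 0, 1])
print(levels)  # [-1.0, 1.0, 0.6, 0.6, -1.0, -0.6, 1.0]
```

For symbols of ±1, this rule is equivalent to a two-tap filter y[n] = c0·x[n] − c1·x[n−1] with c0 = (1 + eq_strength)/2 and c1 = (1 − eq_strength)/2, which is one way a programmable equalization setting can be expressed.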
Receiver DFE (Decision Feedback Equalization) with adaptive algorithm support
Intersymbol interference in a signal can be understood from the pulse response of the channel.
When a pulse passes through the channel, high-frequency attenuation and channel reflections give it a trailing tail, so the signal of an earlier bit degrades the signal quality of later bits. The principle of DFE is to decide whether each of the previous few bits was a 1 or a 0, then weight those decisions and feed them back to cancel the tail left by the earlier bits, improving the quality of the current bit. Unlike equalization techniques such as CTLE, DFE does not amplify noise. For this reason, JEDEC (the Solid State Technology Association) formally introduced DFE in the JESD79-5 specification to strengthen the receiver.
The common 4-tap DFE architecture is also one of the architectures recommended by the JEDEC specification. Because DQ is sampled on both the rising and falling edges of DQS, the sampling circuit is divided into an upper and a lower data path. The four sampled values from the two data paths are multiplied by their weighting coefficients and fed back to the summer (Σ) of each data path, subtracting the ISI contributed by the four preceding bits from the current bit. This structure uses two summers, which increases the load on the DQ_Buf terminal. In addition, all four sampled values must be fed back directly to both summers, which complicates the internal wiring of the chip and hurts high-speed performance.
Another DFE structure selects between the sampled values of the two data paths through a MUX and sends the selected value to a single summer for EQ processing. Because only one summer is used, the wiring complexity inside the chip is reduced and, most importantly, the load on the DQ_Buf terminal is lowered, improving high-speed performance.
The weighting coefficients of the DFE taps can be set manually, provided the channel parameters are known. This is not suitable for mass production, because IO characteristics and channel parameters vary randomly from product to product, so one fixed set of settings cannot guarantee optimal DFE performance for every unit. Obtaining the DFE tap coefficients through adaptive training is the current mainstream method. Xinyaohui’s DDR PHY provides a dedicated firmware training mechanism through which the feedback coefficients of the DFE taps can be obtained quickly. It is highly adaptive, ensures good DFE performance on every chip, and effectively reduces the effects of intersymbol interference and reflections.
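The weighting-and-feedback principle can be illustrated with a small behavioural model. Everything here is an assumption for illustration: the ±1 symbol levels, the two post-cursor values and the function names are made up, and the DFE taps are simply set equal to the channel's post-cursors rather than trained.

```python
def channel(symbols, post_cursors):
    """Add post-cursor ISI: each symbol leaks into the following ones."""
    out = []
    for n, x in enumerate(symbols):
        y = x
        for k, h in enumerate(post_cursors, start=1):
            if n - k >= 0:
                y += h * symbols[n - k]
        out.append(y)
    return out


def dfe_receive(samples, taps):
    """Slice to +/-1 after subtracting ISI estimated from past decisions."""
    history = [0.0] * len(taps)  # most recent decision first
    decisions = []
    for s in samples:
        corrected = s - sum(w * d for w, d in zip(taps, history))
        d = 1.0 if corrected >= 0 else -1.0
        decisions.append(d)
        history = [d] + history[:-1]
    return decisions


tx = [1, 1, -1, 1, -1, -1, 1, 1, -1]
isi = [0.7, 0.5]                       # hypothetical, deliberately severe
rx = channel(tx, isi)
plain = [1.0 if s >= 0 else -1.0 for s in rx]   # slicer without DFE
equalized = dfe_receive(rx, isi)
print(plain == tx, equalized == tx)    # False True
```

With this much ISI the plain slicer misreads several bits, while the feedback path removes the tail of the previous decisions before slicing. Because the correction uses decisions rather than the analog input, it does not amplify noise on the incoming signal.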
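The article does not describe how the firmware finds the tap coefficients. As a stand-in, the sketch below uses a textbook LMS update on a known training pattern; the step size `mu`, the two-tap channel values and all names are assumptions chosen for illustration.

```python
import random


def train_dfe_taps(tx, rx, n_taps, mu=0.05):
    """Estimate DFE tap weights with an LMS update on a known pattern.

    tx: known training symbols (+/-1); rx: received samples.
    """
    taps = [0.0] * n_taps
    for n in range(n_taps, len(tx)):
        past = [tx[n - k] for k in range(1, n_taps + 1)]
        corrected = rx[n] - sum(w * d for w, d in zip(taps, past))
        error = corrected - tx[n]  # residual ISI left after feedback
        taps = [w + mu * error * d for w, d in zip(taps, past)]
    return taps


rng = random.Random(1)
tx = [rng.choice((-1.0, 1.0)) for _ in range(2000)]
true_isi = [0.30, -0.10]  # hypothetical post-cursor ISI of one unit
rx = [
    tx[n] + sum(h * tx[n - k] for k, h in enumerate(true_isi, start=1) if n - k >= 0)
    for n in range(len(tx))
]
taps = train_dfe_taps(tx, rx, n_taps=2)
print([round(w, 3) for w in taps])
```

In this noise-free toy case the estimated taps converge to the channel's post-cursor values, so each unit ends up with coefficients matched to its own channel instead of one shared manual setting.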
Key technical point 4: Fast frequency switching across multiple frequency points for low-power design
DDR is a major power consumer in SoC systems. Reducing DDR power consumption has always been one of the driving forces and directions of DDR technological innovation. The most direct way is to lower the supply voltage, and this is the path the DRAM specifications have followed as they evolved. In addition, starting from DDR4 and LPDDR4, the DRAM specifications define the POD IO architecture (for DDR4 and DDR5), the LVSTL IO architecture (for LPDDR4 and LPDDR5) and data bus inversion (DBI), which effectively reduce power consumption on the IO side.
The power-saving methods above are defined by, and limited to, the JEDEC specifications. Xinyaohui has also developed a dynamic frequency switching technology that effectively reduces total system power consumption. When the DRAM is initialized, this technology trains the configurations for multiple frequency points and saves the corresponding training results. When the system determines that the DRAM does not need to run at a high frequency, it notifies the DDR controller, which then signals over the DFI and puts the DRAM into the self-refresh state. The frequency switch is then carried out automatically within the DFI and the DDR PHY. Once it completes, the DDR controller takes the DRAM out of self-refresh, so the DDR runs at a lower operating frequency and consumes less power. Compared with similar products, the most distinctive feature of this technology is that the whole process requires no firmware involvement and no retraining at the new frequency point, so frequency switching is fast and stable.
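As an illustration of how data bus inversion saves IO power, the sketch below applies the rule commonly described for DDR4, where a POD-terminated line draws current when driven low: a byte with more than four 0 bits is sent inverted and flagged on DBI_n. The function names are hypothetical and the rule is a simplified reading, not a substitute for the JEDEC specification.

```python
def dbi_encode(byte):
    """Return (data_on_bus, dbi_n); dbi_n == 0 means the byte was inverted."""
    byte &= 0xFF
    zeros = 8 - bin(byte).count("1")
    if zeros > 4:
        return (~byte) & 0xFF, 0
    return byte, 1


def dbi_decode(data, dbi_n):
    """Undo the inversion at the receiving end."""
    return data if dbi_n else (~data) & 0xFF


def lows_on_bus(data, dbi_n):
    """Count lines driven low across the 8 data lines plus DBI_n."""
    return (8 - bin(data).count("1")) + (1 - dbi_n)


print(dbi_encode(0x01))  # (254, 0): seven zeros, so the byte is inverted
print(dbi_encode(0xF0))  # (240, 1): four zeros, sent unchanged
```

With this rule, at most four of the nine lines (eight data lines plus DBI_n) are low in any transfer, which bounds the termination current per byte lane.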
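The switching sequence described above can be summarized as a toy model: train every frequency point once at initialization, then switch by reusing the saved result while the DRAM is in self-refresh. The class, the frequency points and the contents of the saved settings are all hypothetical; the real handshake happens in hardware over the DFI and is not modelled here.

```python
class FrequencySwitchModel:
    """Toy model: train each frequency point once, switch without retraining."""

    def __init__(self, freq_points_mhz, train):
        # train(freq) stands in for the full training run at one frequency.
        self.saved = {f: train(f) for f in freq_points_mhz}
        self.current = max(freq_points_mhz)
        self.active = self.saved[self.current]
        self.log = []

    def switch(self, freq_mhz):
        if freq_mhz not in self.saved:
            raise ValueError(f"{freq_mhz} MHz was not trained at initialization")
        self.log.append("controller: DRAM enters self-refresh")
        self.current = freq_mhz
        self.active = self.saved[freq_mhz]  # reuse stored result, no retraining
        self.log.append(f"PHY: clock and saved settings switched to {freq_mhz} MHz")
        self.log.append("controller: DRAM exits self-refresh")


train_calls = []


def fake_train(freq_mhz):
    train_calls.append(freq_mhz)
    return {"freq_mhz": freq_mhz, "read_delay": 3200 // freq_mhz}  # made-up values


model = FrequencySwitchModel([1600, 800, 400], fake_train)
model.switch(400)
print(train_calls)    # [1600, 800, 400]: training ran only at initialization
print(model.current)  # 400
```

The point of the model is that `switch` never calls the training function, which mirrors the claim that no retraining is needed at the new frequency point.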
In the future, market demand for DDR PHY will continue to grow, and demand for advanced process nodes will become more prominent. Xinyaohui began developing IP on FinFET processes early on, and through continuous technological innovation it has become one of the few local companies able to provide DDR PHY that combines advanced process support, superior performance, and stable, reliable operation.
Having come this far, Xinyaohui intends to go a step further. Its people will continue to treat the delivery of high-performance interface IP and high-quality design services as their responsibility, working with chip design companies and wafer foundries to launch better products and to support the development of China’s chip industry.