New co-packaged optics technology could replace electrical interconnects in datacenters, significantly improving the speed and energy efficiency of AI and other computing applications.
Beijing, December 12, 2024 /PR Newswire/ -- IBM (NYSE: IBM) recently unveiled breakthrough research in optics technology that is expected to significantly improve how datacenters train and run generative AI models. IBM researchers have developed a new co-packaged optics (CPO) process that uses optics to enable speed-of-light connectivity inside datacenters, complementing existing short-reach optical cables. By designing and assembling the first publicly announced successful polymer optical waveguide (PWG), IBM researchers have shown how co-packaged optics will redefine the way the computing industry moves high-bandwidth data between chips, circuit boards, and servers.
Today, fiber optic technology is widely used for high-speed data transmission over long distances, carrying nearly all of the world's commerce and communications traffic with light instead of electricity. Although datacenters use fiber optics for their external communications networks, the racks inside them still rely mainly on copper wiring. GPU accelerators connected by these wires can sit idle for more than half of the time during large distributed training runs while they wait for data from other devices, which drives up cost and wastes energy.
IBM researchers have demonstrated a new way to bring the speed and capacity of optics inside datacenters. In a newly published paper, IBM presents a co-packaged optics prototype, which it describes as a world first, that enables high-speed optical connectivity. This technology could significantly increase communication bandwidth in datacenters, minimize GPU idle time, and greatly accelerate AI workloads. The innovation is expected to deliver the following advances:
- Lower costs for scaling generative AI: energy consumption falls more than fivefold compared with mid-range electrical interconnects [1], while the reach of datacenter interconnect cables extends from one meter to hundreds of meters.
- Faster AI model training: large language models can be trained nearly five times faster with co-packaged optics than with conventional electrical wiring, cutting the time to train a standard large language model from three months to three weeks, with even greater gains for larger models and more GPUs [2].
- Dramatically better energy efficiency for datacenters: the electricity saved by training a single AI model with co-packaged optics is equivalent to the annual electricity consumption of 5,000 American households [3].
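As a rough consistency check on the figures in the list above, the following Python sketch recomputes the two ratios from the quoted before-and-after values. It is an illustration only, not IBM's methodology: the function names are hypothetical, the month-to-week conversion is an assumption, and the energy values are taken as "5 units per bit" and "at most 1 unit per bit" from the release's footnote, so the unit cancels in the ratio.

```python
# Rough consistency check of the quoted figures (a sketch, not IBM's method).

WEEKS_PER_MONTH = 52 / 12  # assumption: average month length in weeks


def speedup(before_weeks: float, after_weeks: float) -> float:
    """Return how many times shorter the 'after' duration is."""
    return before_weeks / after_weeks


def reduction_factor(before: float, after: float) -> float:
    """Return the factor by which a quantity shrinks (units cancel)."""
    return before / after


# Training time: three months down to three weeks.
training_speedup = speedup(3 * WEEKS_PER_MONTH, 3)

# Energy per bit: 5 units down to at most 1 unit, so the factor is a lower bound.
energy_factor = reduction_factor(5.0, 1.0)

print(f"training speedup ~ {training_speedup:.1f}x")
print(f"energy-per-bit reduction >= {energy_factor:.0f}x")
```

Under these assumptions the training-time ratio comes out a little above four, which is in line with the "nearly five times" wording, and the energy ratio is at least five because the "after" value is stated as an upper bound.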
Dario Gil, Senior Vice President and Director of IBM Research, said: "Generative AI demands ever more energy and processing power, and datacenters must evolve to keep up. Co-packaged optics can help datacenters face the future with confidence. With this breakthrough, fiber optic cables will carry data through datacenters far more efficiently, and chips will communicate and process AI workloads more efficiently as well. We are entering a new era of faster, more sustainable communications."
Chip-to-chip bandwidth 80 times higher than today's
Advances in chip technology in recent years have made it possible to pack transistors onto chips ever more densely; IBM's 2-nanometer chip technology, for instance, can fit more than 50 billion transistors on a single chip. Co-packaged optics aims to scale the interconnect density between accelerators by helping chip manufacturers add optical pathways that connect chips on electronic modules, going beyond the limits of today's electrical pathways. The new high-bandwidth-density optical structure described in IBM's paper, together with other innovations such as transmitting multiple wavelengths through each optical channel, is expected to raise chip-to-chip communication bandwidth to 80 times that of electrical connections.
Compared with current state-of-the-art co-packaged optics, IBM's results could allow chip manufacturers to connect six times as many optical fibers at the edge of a silicon photonics chip, a measure known as "beachfront density." Each fiber is about three times the width of a human hair, can range in length from a few centimeters to hundreds of meters, and can carry data at terabits per second. The IBM team used standard packaging processes to assemble high-density polymer optical waveguides (PWG) with optical channels at a 50-micron pitch, adiabatically coupled to silicon photonic waveguides.
The paper also reports that these co-packaged optics modules, built with polymer optical waveguides at a 50-micron pitch, are the first to pass all of the stress tests required for manufacturing. The tests subject the modules to high humidity, temperatures ranging from -40°C to 125°C, and mechanical durability testing to confirm that the optical interconnects can bend without breaking or losing data. In addition, the researchers demonstrated polymer optical waveguide technology at an 18-micron pitch: stacking four polymer optical waveguides allows up to 128 channels of connectivity.
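To put the two waveguide pitches in perspective, the sketch below converts each pitch into channels per millimeter of chip edge and estimates the edge width of one layer in the four-layer, 128-channel stack. This is back-of-the-envelope arithmetic only. The function names are hypothetical, and the even split of 128 channels across four layers is an assumption that the release does not state.

```python
# Back-of-the-envelope waveguide density from the quoted pitches (a sketch).


def channels_per_mm(pitch_um: float) -> float:
    """Waveguide channels per millimeter of edge at a given pitch in microns."""
    return 1000.0 / pitch_um


def edge_width_um(channels: int, pitch_um: float) -> float:
    """Approximate edge width, in microns, taken by `channels` at `pitch_um`."""
    return channels * pitch_um


density_50 = channels_per_mm(50)  # pitch of the stress-tested modules
density_18 = channels_per_mm(18)  # pitch of the denser demonstration

# Assumption (not stated in the release): 128 channels split evenly over 4 layers.
channels_per_layer = 128 // 4
layer_width_um = edge_width_um(channels_per_layer, 18)

print(f"50-micron pitch: {density_50:.1f} channels/mm")
print(f"18-micron pitch: {density_18:.1f} channels/mm")
print(f"{channels_per_layer} channels per layer span about {layer_width_um:.0f} microns")
```

Under the even-split assumption, each layer would carry 32 channels across roughly 0.58 mm of edge, which illustrates why a finer pitch and stacking both increase the number of channels available along a fixed chip edge.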
IBM continues to lead in semiconductor research and development
As demand for AI performance grows, co-packaged optics opens a new communication pathway and could shift off-module communication from electrical to optical. The breakthrough builds on IBM's leadership in semiconductor innovation, which includes the world's first 2-nanometer chip technology, the first 7-nanometer and 5-nanometer process technologies, nanosheet transistors, vertical transistors (VTFET), single-cell DRAM, and chemically amplified photoresists.
The design, modeling, and simulation work for this project was completed in Albany, New York, USA, while prototype assembly and module testing were carried out at IBM's lab in Bromont, Quebec, Canada, one of the largest chip assembly and test sites in North America.
[1] Reduced from 5 picojoules per bit to less than 1 picojoule per bit.
[2] Based on training a 70-billion-parameter large language model using industry-standard GPUs and interconnects.
[3] Based on training a very large language model (such as GPT-4) using industry-standard GPUs and interconnects.
About IBM
IBM is a global leader in hybrid cloud, AI, and enterprise services, helping clients in more than 175 countries and regions draw business insights from their data, streamline business processes, reduce costs, and gain a competitive edge in their industries. More than 4,000 government and corporate entities in critical infrastructure sectors such as financial services, telecommunications, and healthcare rely on IBM's hybrid cloud platform and Red Hat OpenShift to carry out their digital transformations quickly, efficiently, and securely. IBM's breakthrough innovations in AI, quantum computing, industry-specific cloud solutions, and enterprise services give clients open and flexible options. IBM's business is built on a long-standing commitment to business integrity, transparent governance, social responsibility, an inclusive culture, and a spirit of service. For more information, please visit:
Media Contact
Cui Shoufeng, shou.feng.cui@ibm.com