Artificial Intelligence, Machine Learning and High Performance Computing
For many years, Central Processing Units (CPUs) with dedicated memory were at the center of High Performance Computing (HPC) architectures. Conventional electrical interconnects like PCIe provided enough bandwidth between closely spaced chips, and incremental link improvements were sufficient to advance overall system performance.
However, Artificial Intelligence (AI) and Machine Learning (ML) are rapidly becoming a major part of HPC. The Department of Energy estimates that demand for AI training is doubling every 3-4 months. With this shift, HPC architectures are evolving into complex clusters of CPUs, Graphics Processing Units (GPUs), Data Processing Units (DPUs) and dispersed memory blocks. All these different processing units (xPUs) rely on a vast mesh of interconnects that exchange large amounts of data with low latency. In this new architecture, the conventional electrical interconnects between the different xPUs have become the critical bottleneck for overall system performance.
This is where Avicena’s LightBundle technology offers a paradigm shift based on a vast improvement in bandwidth density, power efficiency and reach.
The HPC community has long recognized the value of a new class of interconnects to support future disaggregated computer networks. Optical interconnects have always been the prime contender to replace conventional electrical links, which suffer from inherent signal degradation with reach, as well as power and cost challenges. Indeed, optical interconnects have replaced electrical links everywhere in communication networks down to a reach of a few meters. However, until now the available optical technologies have not been practical for the shorter links typically found in computer networks because they were too bulky, challenging to integrate into existing architectures, and unable to tolerate the high-temperature environment of high-performance silicon ICs.
The Avicena LightBundle solution is fundamentally different. Because it is based on microLEDs instead of lasers, power efficiency is improved by up to 100x. The transceiver units can tolerate temperatures up to 125°C and are CMOS compatible, which greatly facilitates integration with the chips they interconnect, either as chiplets or monolithically.
All these characteristics put Avicena’s LightBundle interconnects in a unique position to unleash the full potential of the ever more complex clusters of CPUs, GPUs and DPUs found in modern HPC systems. While HPC systems will be some of the biggest beneficiaries of this new class of high-performance interconnects, many computing and communication systems will benefit from the option to disaggregate different functions. Chassis disaggregation is of great interest because it enables more modular, scalable systems. Disaggregated 1RU and 2RU “Pizza Boxes” are found throughout the compute and communications industry today.
So far we have only discussed the impact of this new class of interconnects on existing system architectures. Now, let’s imagine what could be possible once system architectures are designed around this new class of interconnects, which allows connectivity of up to 10 Tbps/mm² with a reach of up to 10 m at a fraction of today’s power dissipation. The possibilities are endless!
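To give a feel for what a bandwidth density of up to 10 Tbps/mm² could mean in practice, the short Python sketch below multiplies that figure by the chip area set aside for the interconnect. It is a back-of-the-envelope illustration only: the function name and the example areas are hypothetical, and the two constants are simply the "up to" ceilings quoted above, not measured or guaranteed values.

```python
# Back-of-the-envelope ceiling on aggregate interconnect bandwidth.
# The two constants are the "up to" figures quoted in the text; the
# function name and the example areas below are illustrative assumptions.

MAX_DENSITY_TBPS_PER_MM2 = 10.0  # up to 10 Tbps per square millimeter
MAX_REACH_M = 10.0               # up to 10 m of reach


def peak_bandwidth_tbps(area_mm2: float, link_length_m: float) -> float:
    """Return the upper bound on aggregate bandwidth, in Tbps, for a given
    interconnect area, assuming the quoted maximum density is achieved.

    Raises ValueError if the area is negative or the link is longer than
    the quoted maximum reach.
    """
    if area_mm2 < 0:
        raise ValueError("area must be non-negative")
    if link_length_m > MAX_REACH_M:
        raise ValueError("link length exceeds the quoted maximum reach")
    return MAX_DENSITY_TBPS_PER_MM2 * area_mm2


if __name__ == "__main__":
    # Hypothetical interconnect areas, each over a 2 m link.
    for area in (1.0, 4.0, 25.0):
        bound = peak_bandwidth_tbps(area, link_length_m=2.0)
        print(f"{area:5.1f} mm^2 -> at most {bound:6.1f} Tbps")
```

Under these assumptions, even a few square millimeters of interconnect area would correspond to a ceiling of tens of terabits per second. Real designs would land below this bound once packaging, power and protocol overheads are accounted for.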