Fangbo Zhu (Senior Thermal Expert & TPM) - Alibaba
Xiguang - Henry Wu (Product Marketing) - Broadcom
The rapid evolution of AI training and inference is driving up compute and interconnect density, leading to higher power density per rack and increasing the challenge of managing heat dissipation. This presentation will explore the efforts to address these challenges from a networking perspective, ranging from the switch silicon component to the system level.
We will discuss Alibaba's 51.2 Tbps AI switches, which utilize air and cold plate cooling, detailing design philosophies, test results, and deployment experiences. Additionally, we'll examine improvements in power efficiency across generations of switch silicon and enhancements in interconnect technologies that link xPUs more efficiently and resiliently. This case study provides key insights into thermal design strategies and practical experiences vital for developing high-density AI clusters.
Oct 21, 2024