Episode 69 of the Stanford MLSys Seminar “Foundation Models Limited Series”!
Speaker: Aakanksha Chowdhery
Abstract:
Large language models have been shown to achieve remarkable performance across a variety of natural language tasks using few-shot learning, which drastically reduces the number of task-specific training examples needed to adapt the model to a particular application. To further our understanding of the impact of scale on few-shot learning, we trained a 540-billion-parameter dense Transformer language model at Google, which we refer to as the Pathways Language Model (PaLM). In this talk, we discuss the system considerations and model improvements necessary to train the PaLM model across 6144 TPU v4 chips using Pathways at very high efficiency levels. Next, we share how scaling the model to 540B parameters results in state-of-the-art few-shot learning results across hundreds of language understanding and generation benchmarks. We will also share some of the more recent work built on top of PaLM that pushes the state of the art in various domains and democratizes access to natural language processing.
Bio:
Aakanksha has led the effort to train large language models at Google Research, which resulted in the 540B PaLM model. Aakanksha has also been a core member of the Pathways project at Google. Prior to joining Google, Aakanksha led interdisciplinary teams at Microsoft Research and Princeton University across machine learning, distributed systems, and networking. Aakanksha completed her PhD in Electrical Engineering at Stanford University and was awarded the Paul Baran Marconi Young Scholar Award for outstanding scientific contributions in the field of communications and the Internet.
Check out our website for the schedule: mlsys.stanford.edu
Join our mailing list to get weekly updates: groups.google....
Sep 18, 2024