Dan Klein (UC Berkeley)
simons.berkele...
Large Language Models and Transformers
I'll talk about three major tensions in NLP resulting from rapid advances in large language models. First, we are in the middle of a switch from vertical research on tasks (parsing, coreference, sentiment) to the kind of horizontal tech stacks that exist elsewhere in CS. Second, there is a fundamental tension between the factors that drive machine learning (scaled, end-to-end optimization of monoliths) and the factors that drive human software engineering (modularity, abstraction, interoperability). Third, modern models can be stunning on some axes while showing major gaps on others: they can, in different ways, simultaneously be general, fragile, and dangerous. I'll give an NLP perspective on these issues, along with some possible solution directions.
Sep 26, 2024