Ellie Pavlick is a wonderful scientist, and she is right by all accounts: understanding LLMs is mostly an empirical undertaking (and a very important one at that!). Thank you for that detailed and nuanced discussion.
The levels-of-analysis discussion (51:00 ff.) seems reminiscent of an argument that David Marr and Tomaso Poggio first advanced in the mid-1970s: complex information-processing systems, whether in computers or brains, need to be analyzed at several levels, such that the system at level N can be said to be implemented by a somewhat different system at level N-1. Thus, natural language syntax might be implemented one way in the neural networks of the human brain, somewhat differently in the artificial nets of LLMs, and differently again in a "classical" symbolic computing model of the 1970s or so. Something like that seems to be going on in some work I've done on how ChatGPT produces simple stories.