JMM 2018: Tamara G. Kolda, Sandia National Laboratories, gives the SIAM Invited Address on "Tensor Decomposition: A Mathematical Tool for Data Analysis" on January 11 at the 2018 Joint Mathematics Meetings
What a great talk! I was watching tensor decomposition videos on YouTube nearly the whole day, but no video was as good and clear as this one. I recommend that everyone watch this! :)
Great talk! Regarding the question at the end about tensor analysis for deep learning, see the paper "On the Expressive Power of Deep Learning: A Tensor Analysis".
This is a great video! Tamara is a very intelligent person! The thing is, at my previous job I spent 80% of my day verifying and cleaning the data I acquired before I could load it into my tables to do my analysis. External feeds would get truncated. People put information in the wrong field. System logic filtered out information it believed was duplicate. People on a shared drive overwrote tables or changed criteria in queries I had only read-only access to, so I couldn't get the right information. File conversions caused data corruption. Many issues required attention before I could load the data into my own tables to run my own queries and do my own analysis.

When you have errors like these, you'll be running down all kinds of anomalies before you get to the real problems, no matter how well the models in your analysis are set up. Take the mice that were studied for two years: they received treats and their neurons were tracked the whole time. Two years is a long time to study something. Was the same treat, with the same ingredients, given for the full two years? Did the mice develop any cognitive impairments during that time? Was there any change in the lab, like new paint or remodeling, that might have affected the mice's sense of smell? I'm not even going to ask whether the data was collected at the same time, verified, and input into the system correctly.

There are many steps in research (measuring, collecting, inputting, and analyzing), and a misstep at any one of them can throw things off downstream in a large way. Obviously you want your models to be as precise as possible, but you need your data to be correct too. In my experience working with large datasets, most data is flawed, which makes most statistics flawed, but that's not the fault of the model.
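To make the kind of pre-analysis sanity checks described above concrete, here is a minimal sketch in Python using pandas. The column names (subject_id, timestamp, response), the row-count threshold, and the value range are hypothetical placeholders, not anything from the talk or the original comment; real checks would be tailored to the actual feed.

```python
import pandas as pd

def validate(df: pd.DataFrame) -> list:
    """Return a list of data-quality warnings to resolve before analysis."""
    issues = []
    # Truncated external feeds often show up as unexpectedly short tables.
    if len(df) < 100:  # hypothetical minimum expected row count
        issues.append(f"suspiciously few rows: {len(df)}")
    # Values landing in the wrong field tend to violate simple type/range checks.
    if not pd.api.types.is_numeric_dtype(df["response"]):
        issues.append("non-numeric values in 'response'")
    elif not df["response"].between(0, 1).all():  # hypothetical expected range
        issues.append("'response' values outside expected [0, 1] range")
    # Compare key counts to catch overzealous de-duplication or overwrites.
    dup = df.duplicated(subset=["subject_id", "timestamp"]).sum()
    if dup:
        issues.append(f"{dup} duplicate (subject_id, timestamp) rows")
    # Missing cells left behind by corrupted file conversions.
    n_missing = int(df.isna().sum().sum())
    if n_missing:
        issues.append(f"{n_missing} missing cells")
    return issues

# Example usage (hypothetical file):
# problems = validate(pd.read_csv("daily_feed.csv"))
```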