I put a lot of effort into this one to make it as descriptive as possible. It's also a new style of delivering content / animation. Please let me know how you like this. :)
I’m taking a masters in data analytics/program evaluation, and am learning this rn. You summarize the information really well, picking out the really important parts of causal inference to explain. Good job! The later part of the video even helped me conceptualize quasi experimental designs, which use matching like you described. Thanks for the help.
Absolutely beautiful, incredible explanation; I like that it's explained through a practical example! You're very underrated; the future of this channel is bright!
Gemini: The video is about causal inference. It explains what causal inference is and the challenges of performing causal inference using observed data. It also explains different techniques to address these challenges. The video starts by explaining randomized controlled trials (RCTs), which are the gold standard for causal inference. But RCTs are not always possible. So the video talks about causal inference using observed data. Causal inference using observed data is challenging because there can be confounding variables that affect both the treatment and the outcome. The video uses an example of a medical trial for a flu cure to illustrate this point. In the example, age is a confounding variable. The treatment group (people who received the elixir) has an average age of 35 while the control group (people who did not receive the elixir) has an average age of 65. Even if the people in the treatment group recover from the flu faster than the people in the control group, it might be because they are younger, not because of the elixir. Another challenge of causal inference using observed data is selection bias. Selection bias happens when the group chosen for the treatment is not representative of the population. For example, if the people who received the elixir in the medical trial were all young and healthy people, then the results of the trial would not be generalizable to the whole population. The video also talks about counterfactuals, which are what would have happened if a person had not received the treatment. Counterfactuals are necessary to estimate the causal effect of the treatment. There are two techniques for estimating counterfactuals: matching and machine learning. Matching involves finding people in the control group who are similar to the people in the treatment group on all observable characteristics except for the treatment.
The outcome of the people in the control group can then be used as an estimate of the counterfactual for the people in the treatment group. Machine learning can also be used to estimate counterfactuals. A machine learning model can be trained on data from people who did not receive the treatment. The model can then be used to predict what would have happened to the people in the treatment group if they had not received the treatment. The video then talks about the assumptions that need to be made for causal inference using observed data. These assumptions are necessary to make the analysis possible. One of the assumptions is called the causal Markov condition. This assumption says that the treatment only affects the outcome through the variables that are included in the causal graph. Another assumption is called SUTVA (Stable Unit-Treatment Value Assumption). This assumption says that the outcome of a unit would be the same no matter what treatment the other units receive. The last assumption is called ignorability. This assumption says that there are no confounding variables that have not been included in the analysis. The video then shows how to calculate the average treatment effect (ATE) and the conditional average treatment effect (CATE). The ATE is the average difference in the outcome between the treatment group and the control group. The CATE is the average treatment effect for a specific subgroup of the population. In the example of the medical trial, the ATE was 0.1. This means that the people who received the elixir were more likely to recover from the flu than the people who did not receive the elixir. However, the CATE for people over the age of 35 was 0.4, while the CATE for people under the age of 35 was -0.2. This means that the elixir was effective for older people but not for younger people. 
The video concludes by saying that causal inference using observed data can be a powerful tool for making decisions, but it is important to be aware of the challenges and assumptions involved.
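The matching idea in the summary above can be sketched in a few lines of Python. This is only a minimal illustration, not the video's actual procedure: the records are invented toy data, the function name `att_by_age_matching` is hypothetical, and matching is done on age alone (nearest neighbour, with replacement). Because only treated people get a matched counterfactual, the result is the average effect on the treated, not the full ATE.

```python
# Toy records, invented purely for illustration: (age, treated, recovered).
# None of these numbers come from the video.
people = [
    (30, 1, 1), (34, 1, 1), (62, 1, 0), (70, 1, 1),              # treated
    (31, 0, 1), (36, 0, 0), (60, 0, 0), (68, 0, 1), (75, 0, 0),  # control
]

def att_by_age_matching(records):
    """Estimate the average treatment effect on the treated.

    Each treated person is paired with the control person closest in age
    (with replacement); the matched control's outcome stands in for the
    treated person's unobserved counterfactual.
    """
    treated = [r for r in records if r[1] == 1]
    control = [r for r in records if r[1] == 0]
    diffs = []
    for age, _, outcome in treated:
        match = min(control, key=lambda c: abs(c[0] - age))
        diffs.append(outcome - match[2])
    return sum(diffs) / len(diffs)

att = att_by_age_matching(people)
print(att)
```

With more covariates, the distance in `min(...)` would need to compare all of them, and the estimate is only as good as the assumption that no unmeasured confounder remains.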
Amazing explanation! It must've been almost painful to not discuss all the details and caveats and technicalities, but that's what made it valuable for me. Love the music as well :D
Thanks a lot for this video! Keep up the good work, and please try to cover Causal Graphs (Directed Acyclic Graphs) vs Bayesian Network structure learning (also in detail) if you can. Thanks in advance.
I am a causality denier! I don't believe in causality. At least not the causality that we are familiar with. I think we need higher-order logic of at least the 69th degree to come up with an explanation for causality. I don't wear a tinfoil hat. I wear a quantum metamaterial protective helmet.
The counterfactuals seem questionable... Is it really reasonable to say Sam would not get better with the treatment if he did get better without the treatment? That seems highly unlikely, doesn't it?...and the inverse for Rondo seems highly unlikely as well... I'm admittedly clueless about statistics but I'm always on the lookout for bad logic and this was a red flag for me. I don't mean to suggest a bad example on your part but rather that, in general, it seems there is a huge opening for error to sneak in through counterfactuals.
All of the calculations are simple and clear, but a key element is lacking, which you mention at 11:31, namely how to estimate missing data. Could you send a link to an explanation of this element of the presentation?
Your presentation is missing a key element, which you mention at 11:31, namely how to estimate missing data. Could you send a link to an explanation of this element of the presentation?
great explanation, i've been studying c.i. for the past 6 months and your way of explaining was very clear. Cheers from Bolivia. P.S. can you share your discord link again plz
hello, what is the problem with the following approach which aims to account for age without counterfactuals? you can do mean(treatment) - mean(control) for the older group ((0+1+1)/3 = .67) - ((1+0)/2 = .5) resulting in a difference of .17 for the older group and a similar calculation for the younger group yields ((1+0)/2 = .5) - ((1+0+0)/3 = .33) resulting in a difference of .17 for the younger group as well. using this approach, there does not seem to be a difference due to age!
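The within-age-group comparison in the comment above can be sketched in Python. This is a minimal sketch rather than the video's method: the outcomes (1 = recovered, 0 = not) are copied from the comment's arithmetic, while the group labels and function names are hypothetical. It computes the difference in means inside each age group, a size-weighted average of those differences, and the naive pooled difference for contrast.

```python
from statistics import mean

# Outcomes copied from the comment's arithmetic (1 = recovered, 0 = not).
# The keys "older"/"younger" and "treated"/"control" are illustrative labels.
strata = {
    "older": {"treated": [0, 1, 1], "control": [1, 0]},
    "younger": {"treated": [1, 0], "control": [1, 0, 0]},
}

def stratum_effects(groups):
    """Difference in mean outcome (treated minus control) per stratum."""
    return {
        name: mean(g["treated"]) - mean(g["control"])
        for name, g in groups.items()
    }

def weighted_effect(groups):
    """Average the per-stratum effects, weighted by stratum size."""
    sizes = {n: len(g["treated"]) + len(g["control"]) for n, g in groups.items()}
    total = sum(sizes.values())
    effects = stratum_effects(groups)
    return sum(effects[n] * sizes[n] / total for n in groups)

def naive_effect(groups):
    """Pooled difference in means that ignores the age strata."""
    treated = [y for g in groups.values() for y in g["treated"]]
    control = [y for g in groups.values() for y in g["control"]]
    return mean(treated) - mean(control)

effects = stratum_effects(strata)
print(effects, weighted_effect(strata), naive_effect(strata))
```

Comparing within strata adjusts only for the age grouping that was chosen; age differences inside a group, or any other confounder, are left untouched, and it does not give a per-person counterfactual.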
Very useful video. I spent two days reading the actual paper of causal influence. This video is concise but gives me a very good foundation to read the theory.
Hey Ajay, thanks a lot for making this video. Super helpful. Best video I came across on Causal inference. I have a question regarding the balance check between the treatment and control groups. Is it necessary to satisfy the balance criteria if I am using an ML model to predict the counterfactuals? Is it okay if there’s no balance between some confounders in the treatment and control groups? Would really appreciate help with this.
at 10:06 you mention that the age difference was large enough to warrant age being labeled as a confounding variable. what exactly was the magnitude of difference that leads to that assumption? if the age means were 35 and 40, would that be a large enough difference? thanks.
wow!!!! your explanation is better than my epidemiology professor's. thanks a lot!!! By the way, is there any recommended paper on RCT design or about Causal Inference?
Thank you! As for specific resources, I put them in the description of the video. I don't think there is a single one-size-fits-all research paper for the topic, but a collection of these resources does paint a good picture. Also, the next video's description had other resources from a Machine Learning perspective.