Very nicely done, sir. As you mentioned, this process becomes a loop, with drift analysis following the initial implementation. Do you cover drift in one of your presos? Thanks!
Hi Dave, great video. Regarding data preparation, I believe the train/test split should come before missing-value imputation; otherwise information from the test set would leak into training. Do you agree?
Hi Christopher, thanks for your comment. It depends on how you impute the data, but generally the train/test split is created later. It is also good practice to select the test set so that it contains no imputed values, or to drop any row with missing values, again depending on how much of the data is missing. If you have enough data and only a small percentage of missing values, dropping the affected rows typically makes the most sense.
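For anyone following the thread, here is a minimal Python sketch of the two options discussed above: splitting first and filling gaps with a value computed from the training rows only, versus simply dropping incomplete rows. The toy `age` column, the helper names (`split_rows`, `fit_mean`, `impute`, `drop_missing`), and the choice of mean imputation are all assumptions made for illustration, not something taken from the video.

```python
import random
from statistics import mean


def split_rows(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and return (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_mean(train, column):
    """Mean of the observed (non-None) values, using training rows only."""
    return mean(r[column] for r in train if r[column] is not None)


def impute(rows, column, fill_value):
    """Return new rows with None in `column` replaced by fill_value."""
    return [
        {**r, column: fill_value if r[column] is None else r[column]}
        for r in rows
    ]


def drop_missing(rows, column):
    """Return only the rows where `column` is present."""
    return [r for r in rows if r[column] is not None]


# Hypothetical toy data: None marks a missing value.
rows = [{"age": a} for a in [22, None, 35, 41, None, 29, 50, 38]]

# Option 1: split first, then impute with a statistic learned from train only,
# so no test-set value influences the fill value.
train, test = split_rows(rows)
age_mean = fit_mean(train, "age")
train_imputed = impute(train, "age", age_mean)
test_imputed = impute(test, "age", age_mean)

# Option 2: drop incomplete rows, which is reasonable when few are missing.
complete_rows = drop_missing(rows, "age")
missing_share = 1 - len(complete_rows) / len(rows)
```

The leakage concern applies when the fill value is learned from the data, as the mean is here. Checking `missing_share` before choosing option 2 is one simple way to judge whether dropping rows costs too much data.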