Tune and interpret decision trees for predicting capacity of #TidyTuesday wind turbines in Canada. Check out the code on my blog: juliasilge.com/blog/wind-turb...
Hi Julia, I'm a huge fan of yours! Just a request for future consideration: an ML workflow with at least one Python chunk. Would love to learn how you would blend R/Python together. Thanks for all of your great work.
Hi Julia, Great video as usual. Why did you not use the "workflow" this time? Also when would you typically choose to use that approach instead of the "non-workflow" one and vice versa?
Not over multiple *kinds* of models, as in different algorithms. You still need to set those up as separate tuning runs right now, but then you can pretty fluently compare them during the model evaluation phase, the way you compare different tuning options for the same type of model.
Two things come to mind for this. One is this section of our book which has an outline of the modeling process: www.tmwr.org/software-modeling.html#model-phases Another is this outline of what the different packages do: www.tidymodels.org/packages/
Hey Julia, Thanks for another useful screencast. Just a small doubt: while predicting at the end using the workflow, I get the following error, and I wonder what the reason could be. Running `final_res$.workflow[[1]] %>% predict(turbine_train[44,])` gives: Error: Workflow has not yet been trained. Do you need to call `fit()`?
Ah, there is a bug in the current version of tune on CRAN about this. If you can update tune from GitHub, this is fixed. (We are working on a new CRAN release for tune very soon.)
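For anyone hitting the same error, here is a minimal sketch of the pattern that should work once tune is updated. It assumes a recent tune release in which `last_fit()` returns a trained workflow and `extract_workflow()` is available; the built-in `mtcars` data and the names `car_split`, `tree_wf`, and `fitted_wf` are stand-ins, not the objects from the screencast.

```r
library(tidymodels)

set.seed(123)
# Assumption: mtcars stands in for the turbine data
car_split <- initial_split(mtcars)

tree_wf <- workflow() %>%
  add_formula(mpg ~ .) %>%
  add_model(decision_tree(mode = "regression") %>% set_engine("rpart"))

# Fit on the training portion and evaluate on the testing portion
final_res <- last_fit(tree_wf, car_split)

# Pull out the trained workflow and predict on a single row
fitted_wf <- extract_workflow(final_res)
preds <- predict(fitted_wf, training(car_split)[1, ])
preds
```

`extract_workflow(final_res)` returns the same fitted object that `final_res$.workflow[[1]]` points to, so either form should predict without a separate call to `fit()`.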
If I'm understanding your question correctly, you'll want to use `extract_fit_engine()` and then use any typical visualization such as rpart.plot(): parsnip.tidymodels.org/reference/extract-parsnip.html
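A rough sketch of what that could look like, assuming a small regression tree fit on the built-in `mtcars` data in place of the tuned turbine model (`tree_spec`, `tree_fit`, and `engine_fit` are illustrative names):

```r
library(tidymodels)
library(rpart.plot)

# Assumption: a shallow tree on mtcars stands in for the final turbine model
tree_spec <- decision_tree(tree_depth = 3) %>%
  set_engine("rpart") %>%
  set_mode("regression")

tree_fit <- workflow() %>%
  add_formula(mpg ~ .) %>%
  add_model(tree_spec) %>%
  fit(data = mtcars)

# Pull the underlying rpart object out of the fitted workflow
engine_fit <- extract_fit_engine(tree_fit)

# roundint = FALSE sidesteps a warning rpart.plot can emit when it
# cannot retrieve the original training data from the model call
rpart.plot(engine_fit, roundint = FALSE)
```

Because `extract_fit_engine()` hands back the plain rpart object, any tool that accepts an rpart fit can be used in place of `rpart.plot()`.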
You definitely could; the `fct_lump_n()` step especially might be something you would want to learn from training data and then apply to testing data. We have to use good judgment about when to use recipes for a transformation vs. when to apply it before starting a modeling workflow (maybe even before splitting into testing and training data). The important things to think about are how information leakage may creep in, whether this is a statistical transformation that you want to learn from one data set and apply to others, whether this is a deterministic transformation that isn't affected by that kind of thing, etc. Some of the steps here are a bit in a gray area. You can read more about related issues here: www.tmwr.org/recipes.html#skip-equals-true
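As a hedged illustration of learning the lumping from training data, here is a small sketch using `step_other()`, the closest recipes analogue I know of. Note that it pools by a frequency threshold rather than keeping the top n levels the way `fct_lump_n()` does. The toy data and the column names `province` and `capacity` are made up for the example.

```r
library(tidymodels)

# Hypothetical training data: NS and PE are each under 10% of rows
toy_train <- tibble(
  province = factor(rep(c("ON", "QC", "AB", "NS", "PE"),
                        times = c(80, 60, 40, 14, 6))),
  capacity = seq(1000, 2000, length.out = 200)
)

# Hypothetical new data, using the same factor levels as training
toy_test <- tibble(
  province = factor(c("ON", "NS", "PE", "AB"),
                    levels = levels(toy_train$province)),
  capacity = c(1500, 1200, 1100, 1800)
)

rec <- recipe(capacity ~ province, data = toy_train) %>%
  step_other(province, threshold = 0.1)

# prep() decides which levels to keep using the training data only
prepped <- prep(rec)

# bake() applies that same decision to new data without re-estimating it
baked <- bake(prepped, new_data = toy_test)
baked
```

The point of the design is that the set of levels to keep is estimated once in `prep()` and then reused, so the testing data never influences which levels get pooled.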
@JuliaSilge Thank you for such a thorough response. I've been working through your book (tmwr) with Max Kuhn and I just searched "tidymodels r tutorials" to get my hands a little dirty when I found your videos. Thank you again!