Predict injuries for Chicago traffic crashes with tidymodels 

Julia Silge

Download up-to-date city data from Chicago's open data portal and predict whether a traffic crash involved an injury with a tidymodels bagged tree model. This is planned to be the first in a series walking through how to approach model ops tasks using tidymodels and other R tools. Check out the code on my blog: juliasilge.com...
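
As a rough illustration of the kind of model described above, here is a minimal sketch of a bagged tree classifier in a tidymodels workflow. The data is a small synthetic stand-in, not the Chicago crash data, and the column names (`speed_limit`, `crash_hour`), `min_n = 10`, and `times = 25` are assumptions made up for the example rather than the choices made in the video.

```r
library(tidymodels)
library(baguette)  # provides the rpart engine for bag_tree()

set.seed(2020)

# Synthetic stand-in data (hypothetical columns, not the real crash data)
toy_crashes <- tibble(
  injuries = factor(sample(c("injuries", "none"), 200,
                           replace = TRUE, prob = c(0.3, 0.7))),
  speed_limit = sample(c(20, 30, 45), 200, replace = TRUE),
  crash_hour = sample(0:23, 200, replace = TRUE)
)

# Bagged decision trees: an ensemble of trees fit to bootstrap samples
bag_spec <- bag_tree(min_n = 10) %>%
  set_engine("rpart", times = 25) %>%
  set_mode("classification")

# Bundle the model formula and the model specification
crash_wf <- workflow() %>%
  add_formula(injuries ~ .) %>%
  add_model(bag_spec)

crash_fit <- fit(crash_wf, data = toy_crashes)

# Class predictions for the first few rows
preds <- predict(crash_fit, new_data = head(toy_crashes))
preds
```

With real data, the formula would typically be replaced by a recipe that handles date features and class imbalance, and the model would be evaluated on resamples rather than on the rows it was fit to.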

Published: Aug 28, 2024

Comments: 21
@avnavcgm 3 years ago
Thank you yet again for this exceptional material Ms. Silge.
@sidharthadaggubati438 3 years ago
This channel deserves more views. High quality content. Thank you Julia
@HamJeong 3 years ago
Incredibly useful, thanks so much, I really learn a lot from you sharing like this!
@hesamseraj 3 years ago
Once again thank you very much Julia. I watched and worked the coding of all your videos and will be following you as long as you share these fantastic videos.
@brendenmorley2643 3 years ago
Once again your tutorial is sooo insightful. My R knowledge continues to explode, thanks to your time and work.
@ochiwar 3 years ago
Another Excellent tutorial! I love your plot theme/aesthetics. Will it be possible for you to share your ggplot template theme? Thanks!
@JuliaSilge 3 years ago
I have it in a little personal package here -- theme_plex(): github.com/juliasilge/silgelib
But there are some very similar themes in the hrbrthemes package (the one that uses IBM Plex): cinc.rud.is/web/packages/hrbrthemes/
@mattm9069 3 years ago
thanks Julia!!!
@datasciencenerd3263 3 years ago
I learn a lot from you, thank you.
@prod.kashkari3075 3 years ago
Hello Julia! Thanks so much for these tutorials and your book on tidymodels. I'm an undergrad who wanted to learn machine learning in R, and you had great resources to help me get started. A few things I wanted to ask you about tidymodels, based on what I've noticed recently when working with it:
1. I've been getting errors when trying to call the tune_grid() function. I have all my workflows set up and my recipe (I even prep and bake it to check that it's good), and I create my cross-validation folds and tuning grids. Yet when I call tune_model and pass in my workflow, resamples, and grid, it says that my models have failed. Do you know what the source of this could be? It says on every fold that something failed. It is also very slow and tends to freeze.
2. When I try to fit with my workflow object by calling fit(), I get a message that says "error could not find fit function from workflow". I solved the problem by putting parsnip:: in front of it and it worked fine, but this error came up one day at random when I had never experienced it the day before.
These issues, I'm sure, are because tidymodels is so new and in development. Also, as a request, could you make more videos on the stacks package and building ensemble learners in tidymodels? Thanks!
@JuliaSilge 3 years ago
In general, I'd recommend making sure your packages are up to date with the latest CRAN versions. If you can create a reprex with your problem and post on RStudio Community, we are happy to help find the solution: rstd.io/tidymodels-community
@terrencerussell1999 3 years ago
Hey Julia! Great stuff again here as always. I look forward to each one of your posts and follow along in R. When doing this one with my own Canadian lat/longs, I don't get a map like the one you produced for Chicago. Is that a limit of the function for Canadian coordinates, or am I missing something?
@JuliaSilge 3 years ago
Hmmmmm, I haven't looked at data from Canada so I can't say for sure. If you can put together a small, self-contained reprex demonstrating the issue and post on RStudio Community, I bet folks will be eager to help. There is even a spatial tag where you can get interested folks to see: community.rstudio.com/tag/spatial
@terrencerussell1999 3 years ago
@JuliaSilge OK great, will do! Thanks again
@UndecidedFellow 3 years ago
Thank you for the video Dr Silge! Quick question, how are `bag_tree()` and `vfold_cv()` functions accounting for the time series nature in the data? I'm reading the documentation and it looks like your current pipeline treats the dates as non ordinal and categorical, using the dates as factors with line `step_date(crash_date) %>%`. Is my reading correct? In short, why did you choose `vfold_cv()` over `rolling_origin()` and how is seasonality/autocorrelation modeled in your pipeline?
@JuliaSilge 3 years ago
So this isn't time series in the sense that I want to predict the next crash(es). Instead it is a classification model where some of the predictors are date features. You can look at another example of this kind of model here: www.tidymodels.org/start/recipes/
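
To make the "date features" point concrete, here is a small sketch of what `step_date()` does to a date column. The three-row data frame is invented for illustration; only the `crash_date` and `injuries` names come from the discussion above, and the choice of `features = c("dow", "month")` is an assumption.

```r
library(recipes)

# Tiny made-up data set with a date predictor
crashes <- data.frame(
  crash_date = as.Date(c("2021-01-04", "2021-06-19", "2021-12-25")),
  injuries = factor(c("none", "injuries", "none"))
)

# step_date() derives new columns from the date; it does not model
# ordering or autocorrelation between rows
rec <- recipe(injuries ~ crash_date, data = crashes) %>%
  step_date(crash_date, features = c("dow", "month"))

date_features <- rec %>%
  prep() %>%
  bake(new_data = NULL)

# New factor columns crash_date_dow and crash_date_month appear
names(date_features)
```

Each row is treated as an independent observation whose day of week and month are ordinary predictors, which is why ordinary `vfold_cv()` resampling is a reasonable fit for this kind of classification problem.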
@mattm9069 3 years ago
Julia, can you please elaborate on what step_downsample() does once we get to the resampling steps? I wanted to see what I would get out of this code: train_preprocessed <- [recipe] %>% prep(crash_train) %>% juice(). I get a balanced dataset of the outcome variable, and it has ~45,000 rows. Yet one cross-validation fold has 138,000 rows for analysis. So, I want to understand what's happening conceptually. I've seen other people build the recipe from the original dataset, but we use the training set (i.e. recipe(injuries ~ ., data = crash_train)).
@JuliaSilge 3 years ago
Reading this section might help clear some things up for you: www.tmwr.org/recipes.html#skip-equals-true As well as the section a little bit further about row sampling steps like downsampling. A subsampling step like `step_downsample()` will downsample the analysis set of a CV fold but not the assessment set.
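
The skipping behavior described in that reply can be seen on a toy example. This is a sketch with made-up data (100 rows, 20 in the minority class), assuming the default settings of `step_downsample()` from the themis package, where `skip = TRUE` and the majority class is sampled down to the size of the minority class.

```r
library(recipes)
library(themis)

set.seed(123)

# Made-up imbalanced data: 20 "injuries" rows, 80 "none" rows
toy <- data.frame(
  x = rnorm(100),
  injuries = factor(rep(c("injuries", "none"), times = c(20, 80)))
)

rec <- recipe(injuries ~ x, data = toy) %>%
  step_downsample(injuries)

prepped <- prep(rec)

# Data the model is fit on: the step is applied, so classes are balanced
n_fit <- nrow(bake(prepped, new_data = NULL))

# Data passed in for prediction or assessment: the step is skipped
n_new <- nrow(bake(prepped, new_data = toy))

c(fit_rows = n_fit, new_rows = n_new)
```

The same logic explains the row counts in the question: `juice()` returns the downsampled training data, while the row count reported for a fold's analysis set is taken before the recipe (and its downsampling step) is applied within that fold.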