Hi Hutsons Hacks. Could you please help with a couple of challenges I've met using prediction in caret for elastic nets? 1. How can I share the model without sharing the original training dataset? When I look at the structure of the model, it seems to always carry the original dataset, even when it predicts on another dataset. 2. How can I apply the model to datasets with another structure? So far, the predict function demands exactly the same number of predictors as in the initial dataset, no more and no less. To me it would seem reasonable to only need the predictors whose weights are not equal to 0. 3. How can I call predict on a testing dataset where one of the predictors chosen during training is missing? 4. Why does the predict function for glm not produce the same predicted values as the sum of each predictor multiplied by its weight?
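On question 4: the sum of weight × predictor is the linear predictor on the link scale, and a glm's predict (with type = "response") additionally applies the inverse link function, which is why the two numbers differ. A minimal stdlib-Python sketch of that distinction, using made-up logistic-regression coefficients:

```python
import math

# Hypothetical coefficients from a fitted logistic regression (logit link):
intercept = -1.2
weights = {"age": 0.03, "bmi": 0.05}
row = {"age": 50, "bmi": 27.0}

# The linear predictor IS the sum of each predictor times its weight...
linear_predictor = intercept + sum(weights[k] * row[k] for k in weights)

# ...but predict() on the response scale then applies the inverse link,
# here the logistic function, so the two values differ.
probability = 1 / (1 + math.exp(-linear_predictor))

print(linear_predictor)  # log-odds scale
print(probability)       # response (probability) scale
```

For a glm with an identity link (ordinary linear regression) the two would agree; for logit, probit, log, etc. they will not.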
Thank you so much for creating this package and for providing this video! I am wondering if you could explain (or point me toward some readings) for your point at 6:09 that a statistically significant intercept connotes omitted variable bias? I haven't heard this before and would love to learn more. -Todd
So, you might want to host the docker image on an AWS ECS or EC2 instance, so that other people can access your model from the web through their browser. That way anyone can use the model even if they don't have R installed on their computer.
Hi buddy, your code works nicely, but when I try to download 300+ images the code keeps running continuously. I think the Google page doesn't have 300+ images, so how can I tell how many images are actually available to download?
Yeah, that is the problem with Selenium: Google knows that you are running automation and blocks it if you try too many images. You haven't done anything wrong, it's just a general issue with Google.
I'm trying to use google lens to download images similar to the image I have. How should I change the code to auto download images in google lens ? I would love any help.
Hello, how should I change the code to auto-download images in Google Lens? Can you give me an idea? I used the link as in Google Images but it does not download.
Just one question: you did the pre-processing (scaling) and balancing before splitting the dataset. Is this correct? I read that this needs to be done after splitting to avoid leakage into the test dataset.
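The commenter's reading is the standard advice: split first, fit any scaling (and do any balancing) on the training fold only, then apply the fitted statistics to the test fold. A minimal stdlib sketch of that ordering, with toy numbers:

```python
# Sketch: fit scaling parameters on the training split only, then apply
# them to the test split, so no test-set information leaks into training.
train = [1.0, 2.0, 3.0, 4.0]
test = [10.0, 12.0]

# Statistics come from the TRAIN split alone.
mean = sum(train) / len(train)
var = sum((x - mean) ** 2 for x in train) / len(train)
std = var ** 0.5

scaled_train = [(x - mean) / std for x in train]
scaled_test = [(x - mean) / std for x in test]  # reuse TRAIN statistics
```

Computing the mean/std over the full dataset before splitting would let the test values influence the transform, which is exactly the leak being described.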
Awesome video, works well! Just want to note here the errors that you faced at 38:18 for my reference and others in the future — 3 errors:
- line 81 — path needs to end with '/': player_path = './images/nottingham_forest/'
- line 91 — forgot the 'url' parameter: urls = get_images_from_google(wd, 0.2, TOTAL_NUMBER_OF_EXAMPLES, url_current)
- lines 94-97 — arguments were outside of the brackets: download_image(down_path=f'images/nottingham_forest/{lbl}/', url=url, file_name=str(idx+1)+'.jpeg', verbose=True)
Great tutorial - worked for me on Mac! Just one question - is there a way to edit the code such that the photos downloaded are of a higher resolution (i.e., allowing the photo to load for a bit then downloading the image rather than the thumbnail?)
Good question. You would have to actually go to the website where the image is hosted and extract the right <img> tag in HTML. This would make the code much less performant.
@@hutsons-hacks3668 I think opening the preview of the image and downloading it from there also works. It should be full resolution without the need to actually access the image host.
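For anyone wanting to try the "extract the right &lt;img&gt; tag" route, a minimal stdlib sketch of pulling image URLs out of fetched page HTML with html.parser — the markup here is hypothetical, and real host pages will use different classes and attributes:

```python
from html.parser import HTMLParser

# Hypothetical host-page markup; real Google Lens / image-host HTML will
# differ, so treat the tag and attribute layout here as an assumption.
PAGE = '<html><body><img class="full-res" src="https://example.com/photo.jpg"></body></html>'

class ImgSrcFinder(HTMLParser):
    """Collect the src attribute of every <img> tag in a page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attrs = dict(attrs)
            if "src" in attrs:
                self.sources.append(attrs["src"])

finder = ImgSrcFinder()
finder.feed(PAGE)
print(finder.sources)
```

The extracted URL could then be downloaded directly, at the cost of one extra page fetch per image (the performance hit mentioned above).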
Looking at the error it is to do with how your JSON is being passed. Can you examine your JSON string and make sure it matches the pattern needed to pass a request to the API?
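A quick way to check a payload before sending it is to parse it locally and verify the expected fields are present; a small sketch (the field names are hypothetical, substitute whatever your API's pattern requires):

```python
import json

# Sketch: confirm a payload string is valid JSON and contains the fields
# the API expects before posting it. Field names here are made up.
payload = '{"age": 50, "bmi": 27.0}'

try:
    body = json.loads(payload)
except json.JSONDecodeError as err:
    raise SystemExit(f"Malformed JSON near position {err.pos}: {err.msg}")

required = {"age", "bmi"}
missing = required - body.keys()
assert not missing, f"Missing fields: {missing}"
```

json.JSONDecodeError reports the character position of the problem, which usually pinpoints things like trailing commas or unquoted keys.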
Thank you, this is a magnificent addition to my reports! A question though, since I'm migrating from caret to tidymodels: with tidymodels, do you need to specify something for a multinomial classifier?
Hey, thank you for this, but your videos are overstimulating and not easy to understand or implement, because you continuously skip through important steps and beginners cannot follow you.
very basic question but could you give an example of how the API might be used in a practical scenario (i.e. on a job or even at school). my background is in academic research so seeing the power of these types of things can be difficult sometimes! Thank you for the great video!
You train a model so that you can pass it unseen data to predict on. For example, I created an emergency department predictor based on patient variables; once this was trained we needed a way to pass production data to it. A good way to do this is a platform-agnostic API that accepts JSON, meaning the API could be used from R, Python, JavaScript, etc.
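As a concrete illustration of the platform-agnostic part, here is a sketch of calling such a prediction API from stdlib Python — the endpoint URL and patient fields are hypothetical, so swap in your own service:

```python
import json
from urllib import request

# Hypothetical endpoint and fields: substitute your own deployed model's URL.
URL = "http://localhost:8000/predict"
patient = {"age": 50, "arrival_mode": "ambulance"}

# Build a POST request carrying the patient record as JSON.
req = request.Request(
    URL,
    data=json.dumps(patient).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# response = request.urlopen(req)   # uncomment against a live API
# prediction = json.load(response)
print(req.get_full_url(), req.get_method())
```

The same JSON body could just as easily be posted from R (httr), JavaScript (fetch), or a shell (curl) — that is the point of putting the model behind an API.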
Original intention was for odds. But if you can find a way to add the effects and estimates, please consider forking the package, making the additions and you would then be a contributor.
Hi, thanks, very useful. 1) Can you please clarify whether MLDataR can be used when your outcome is a continuous variable (e.g., age, birth weight, etc.) rather than a categorical variable like you have shown? 2) Can MLDataR be used to visualise the outcomes, such as predicted vs actual, as well as ROC?
Yes, you would fit a regression to the problem, such as ElasticNet. The process would be much the same in TidyModels, but you would need to set the mode to regression.
Just going to second that default variable labels are quite difficult to work with. I've used glm from the stats package for the data input into odds_plot, and the variable labels include both the dataframe and the vector name. If there's a way to make the default variable name the vector only, that would be a huge help.
@@hutsons-hacks3668 Hi there! Just following up about coefficient names from glm() outputs. If a factored level is supplied (i.e. insurance with Private and Government sub-levels), the output name becomes the name of the vector plus the sub-level (i.e. insurancePrivate, insuranceGovernment). To my understanding, there's no way to currently rename those variables (i.e. inputting a character vector like c("Private Insurance", "Government Insurance")). Similar to OddsPlotty, the texreg package creates coefficient plots that allow you to rename these variables with the custom.coef.names and custom.coef.map arguments. Is there any way a similar command to rename automatically generated coefficient variables could be implemented in OddsPlotty?
Please note the ConfusionTableR package has changed. Please see how to use: cran.r-project.org/web/packages/ConfusionTableR/vignettes/ConfusionTableR.html
It is null for multiclass models; you need to refer to the McNemar-Bowker version. However, this is a wrapper for caret's confusion matrix, and that does not have this implemented for multiple classification. I hope this answers your question?
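For reference, Bowker's generalisation of McNemar's test for a k x k table can be computed directly from the off-diagonal counts; a stdlib sketch with made-up counts:

```python
from itertools import combinations

# Sketch of the McNemar-Bowker statistic for a k x k table:
# sum over cell pairs (i, j), i < j, of (n_ij - n_ji)^2 / (n_ij + n_ji).
table = [
    [30, 5, 2],
    [3, 40, 4],
    [1, 6, 50],
]

stat = 0.0
k = len(table)
for i, j in combinations(range(k), 2):
    diff = table[i][j] - table[j][i]
    total = table[i][j] + table[j][i]
    if total:  # skip empty off-diagonal pairs
        stat += diff ** 2 / total
# Under the symmetry hypothesis, stat follows a chi-square distribution
# with k * (k - 1) / 2 degrees of freedom.
print(stat)
```

For k = 2 this reduces to the ordinary McNemar statistic, which is why caret only reports it in the binary case.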
Awesome stuff! Looking forward to annealing/lasso and more!! One thing I would like to request is whether you can allow handling of missing data, for example using the rfImpute call? Also, is it possible to combine the highly correlated features rather than dropping them completely?
I'm enjoying the pace and delivery of these Python videos, thanks. Just to mention that you have misrepresented what the set union does: it doesn't require the same number of members in each set and doesn't really have any relationship to indexing, unless I am missing some nuance. It's hard to get everything right on live video, and it got me thinking, but I'm just mentioning it in case anyone is confused.
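To illustrate the commenter's point, union simply merges members — the sets can have different sizes, duplicates collapse, and there is no notion of position or index because sets are unordered:

```python
# Set union merges members; sizes can differ and there is no indexing,
# because sets are unordered collections of unique elements.
a = {1, 2, 3}
b = {3, 4}

merged = a | b          # same as a.union(b)
print(merged)           # the value is {1, 2, 3, 4}
print(a.union(b) == b.union(a))  # union is symmetric
```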
Thanks a lot for sharing. I'm really looking out for a tutorial on model stacking, preferably using the stacks package. Please kindly put something out, cheers!
Another helpful video, it's great to follow along - I'm using a Jupyter notebook. Gary is improvising on live video, so some minor inconsistencies are inevitable, and trouble-shooting the code helps cement your learning. My tip is to keep an eye on the variable names in the enumeration sections at the end ;-) e.g. make teams_list contain all the countries, not ('England', 'England', ...)
Thanks for another useful tutorial Gary. Just to show I'm paying attention, you cast 'reality' as an integer in line 93 (around time 10:30 in the video) which negates the need to then fix the format in the print command as far as I can see. Got me thinking though so all good.
Three questions: (1) Is there not a risk, when doing feature reduction using a random forest method, in then afterwards using the reduced feature set as the basis for your random forest model? (2) Can this package be used in some way on categorical/factor features? (3) How does it integrate with the #TidyModels framework? In the recipes package one can add specified pre-processing recipe steps for dealing with highly correlated features. Could you please explain how it differs compared to the preprocessing steps available in recipe()? Thank you!
Not really, no, because you are removing the redundant features using mean decrease in accuracy prior to fitting your master model. This way you have a reduced set that will make any other model trained afterwards work more quickly and have better accuracy, due to the right features being in the model.
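The mean-decrease-in-accuracy idea mentioned here is permutation importance: shuffle one feature at a time and see how far the model's accuracy falls. A toy stdlib sketch, with a hypothetical "fitted" model and made-up data where only x1 matters:

```python
import random

# Toy permutation-importance sketch: shuffle one feature at a time and
# measure the drop in accuracy. Data and "model" are made up: y == x1,
# while x2 is pure noise that the model ignores.
random.seed(0)
rows = [(i % 2, random.random(), i % 2) for i in range(200)]  # (x1, x2, y)

def model(x1, x2):
    return x1  # pretend this was learned from the data

def accuracy(data):
    return sum(model(x1, x2) == y for x1, x2, y in data) / len(data)

base = accuracy(rows)

importances = {}
for idx, name in [(0, "x1"), (1, "x2")]:
    column = [r[idx] for r in rows]
    random.shuffle(column)  # break the feature's link to the outcome
    permuted = [
        (v, r[1], r[2]) if idx == 0 else (r[0], v, r[2])
        for v, r in zip(column, rows)
    ]
    importances[name] = base - accuracy(permuted)
```

Shuffling x1 destroys the accuracy while shuffling x2 changes nothing, so x1 gets a large importance and x2 gets zero — the redundant feature is the one you can safely drop before fitting the master model.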
Recipes could be used with this; you would need to apply these steps before model training. RFE is not available at the moment in Recipes, as the resampling would cause a slowdown on that part of the pipe. As far as I know there are steps for zero-variance removal and other types, such as resampling. I have made this work with other tools like caret and mlr3 in R.
Hey, this is really cool! How does this package perform on regression problems? I would like to reduce the number of predictors for my sub-models. I am using #TidyModels machine learning with stacks() to create an ensemble model, but I originally have >40 predictor features.