Multiple Imputation: A Righteous Approach to Handling Missing Data

To request the .pdf of the handout please contact us with the name of this presentation at:
www.omegastati...
Sign up for our mailing list to receive the latest news, events, and promotions from Omega Statistics: dashboard.mail...
It will sound like cheating, but it isn't. It's so righteous dude! Multiple imputation (MI) is an effective and responsible way to handle data which is missing at random (MAR). You'll find out what that means too...
Please join Elaine Eisenbeisz, Owner and Principal of Omega Statistics, as she presents an overview of MI concepts. (Original Air Date: August, 2014)

Published: 21 Aug 2024

Comments: 77

@georgezisis1122 3 years ago

OMG!!!!!!!!!!!!!!!!!!! you have no idea how many years of life you saved me. Thanks!

@omegastatistics 3 years ago

Glad to have helped!

@interwebzful 6 years ago

hey people: start 6 mins into it

@bevansmith3210 5 years ago

Thank you for the video. Could you kindly respond to this question? For future analyses, which imputed dataset do we use? I know we are meant to use the pooled data, but that means you need to use all 5 imputed datasets each time you want to measure something (MSE, regression, etc.). But what if you just want to impute the missing values and end up with ONE dataset for future use? Thank you, Bevan

@omegastatistics 5 years ago

Hello Bevan, Please look at Dr. Kumar's response at this link: www.researchgate.net/post/How_can_one_create_a_pooled_dataset_in_SPSS_for_further_analysis There is a part where he says to pool the 5 imputed variables by summing them: V1+V2+V3+V4+V5, for instance. However, he forgot to say to divide the sum by the number of imputations, in this case 5. So basically you can take the average of the imputed values to derive the pooled values. I am not sure if SPSS now gives pooled standard deviations, so you may have to do some averaging with standard deviations and standard errors too.
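A minimal Python sketch of the averaging described in this reply, offered as an illustration rather than as SPSS's own procedure. The function name `average_imputations` and the toy data are hypothetical; each imputed dataset is assumed to be a list of rows with identical columns, where observed cells are the same in every dataset and only the imputed cells differ.

```python
from statistics import mean

def average_imputations(imputed_datasets):
    """Average every cell across m imputed datasets.

    Each dataset is a list of row dicts with the same columns and row order.
    Observed values are identical in every dataset, so averaging leaves them
    unchanged; imputed cells become the mean of their m imputed values.
    """
    first = imputed_datasets[0]
    pooled = []
    for i in range(len(first)):
        pooled.append({
            col: mean(ds[i][col] for ds in imputed_datasets)
            for col in first[i]
        })
    return pooled

# Hypothetical data: row 1's "score" was missing and imputed 5 times.
imputed_scores = [11.0, 12.0, 13.0, 14.0, 15.0]
datasets = [
    [{"age": 30, "score": 10.0}, {"age": 41, "score": s}]
    for s in imputed_scores
]

pooled = average_imputations(datasets)
print(pooled[1]["score"])  # (11 + 12 + 13 + 14 + 15) / 5
```

A single averaged dataset is convenient, but it treats the imputed values as if they had been observed, so standard errors computed from it will tend to be too small. Running the analysis on each imputed dataset and pooling the results avoids that.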

@HealthbeautyluckyshahBlogspot 2 years ago

Great way to teach, and I solved the issue I was having. But I have a few questions: 1) Can you share the reference article to cite? 2) I have read a few papers and they suggest different methods (linear or logistic) for different types of variables. I noticed you used one method for all types of variables. Can you explain or give a reference to an article? 3) If I understood correctly, the data is MAR, not MNAR?

@omegastatistics 2 years ago

Hello Beautywdbrain, which article did you want? Please give me the name and I will see if I can find a copy. The example I did was for linear regression with a continuous outcome variable. Of course there will be different ways of working with different models, more than I can explain here. Google is your friend; do some searching and you will find info. :)

@HealthbeautyluckyshahBlogspot 2 years ago

@omegastatistics Thank you for the reply. I wanted a reference for an article to cite if I use the same method as you did in my research. Hope this clears it up.

@talzabidi1569 3 years ago

Hi Dr, thanks so much for your efforts. I would like to ask: are there any conditions under which we can't use MI to treat missing data?

@omegastatistics 3 years ago

Hello Talal, I am not a Dr., but thanks for thinking so :) The most important condition is that your data is not MNAR. If your data is MNAR, then imputation and other methods are not going to return good estimates (read more about missingness and the basics of imputation here: www.ncbi.nlm.nih.gov/pmc/articles/PMC2818781/). In this presentation I showed MI using linear regression, which is used for continuous variables. You can also impute nominal variables with logistic regression imputation or discriminant analysis imputation. Here is a link to a paper about imputation of nominal variables: support.sas.com/resources/papers/proceedings/proceedings/sugi30/113-30.pdf

@bonniekenaley3831 6 years ago

Thank you for your most valuable presentation!

@20jakubukaj08 5 years ago

Another question: what is the pooling method that is employed in SPSS to aggregate the imputed results? Is this just the mean of the 5 regression coefficients, for example?

@omegastatistics 4 years ago

Look at slide 34 in the presentation at this link; I think it may help: rmc.ehe.osu.edu/files/2018/02/0.0-Workshop_missing-data-with-SPSS_Finalaudience.pdf
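For readers who want the arithmetic, here is a hedged Python sketch of Rubin's rules, the standard pooling approach in the multiple imputation literature. It assumes this is the pooling being asked about; the function name `pool_rubin` and the numbers are hypothetical. The pooled coefficient is indeed the mean of the per-imputation coefficients, but the pooled standard error also adds the between-imputation variance to the average within-imputation variance.

```python
import math
from statistics import mean, variance

def pool_rubin(estimates, std_errors):
    """Pool one coefficient across m imputed datasets using Rubin's rules.

    Returns (pooled_estimate, pooled_standard_error).
    """
    m = len(estimates)
    q_bar = mean(estimates)                  # pooled point estimate
    u_bar = mean(se ** 2 for se in std_errors)  # within-imputation variance
    b = variance(estimates)                  # between-imputation variance (m - 1 denominator)
    total = u_bar + (1 + 1 / m) * b          # total variance
    return q_bar, math.sqrt(total)

# Hypothetical coefficient and standard error from each of 5 imputed datasets.
coefs = [1.0, 2.0, 3.0, 4.0, 5.0]
ses = [1.0, 1.0, 1.0, 1.0, 1.0]

pooled_coef, pooled_se = pool_rubin(coefs, ses)
print(pooled_coef, pooled_se)
```

With these toy numbers the within variance is 1.0 and the between variance is 2.5, so the total variance is 1.0 + 1.2 × 2.5 = 4.0. The pooled standard error is larger than any single-dataset standard error because it also reflects the uncertainty about the imputed values.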

@20jakubukaj08 5 years ago

Thank you very much for this useful and clear presentation. I have a question: one thing that you repeat a number of times in the presentation is that you cannot tell if your data are MAR or MNAR. It seems to me that if you find a correlation between missingness in one variable and another manifest variable (i.e. a variable for which you have observed data), then the data is certainly MNAR. Or perhaps I have misunderstood what you meant?

@omegastatistics 4 years ago

Correlations can be spurious, caused more by a latent variable than by the variables you observed. What I meant by not knowing for sure is that you typically don't know for sure why something is missing, because in many cases you can't go back and ask the respondent or data source what happened. Of course, if you know something is systematic, then you know. But often you won't know for sure.

@roiad876 2 years ago

This has been very instructive, thanks! I'd just ask if you could please show us your syntax output? It would add to the reproducibility of research.

@omegastatistics 2 years ago

If you would like syntax and output, please email me at info@omegastatistics.com with your request. Thanks!

@joannayeung6695 3 years ago

Thank you so much! This is really helpful! May I know if my data sets include both categorical and continuous data, how should I handle it? I have coded the categorical missing data as dummy variables, then should I just do the multiple imputations on the continuous missing data?

@luanafantini6366 4 years ago

It was very clear. Thank you!

@omegastatistics 4 years ago

You're welcome!

@lidiabezerra32 3 years ago

Hey, I loved this video. Please, I would like to know some references in the literature that address this issue.

@omegastatistics 3 years ago

Hi Lidia, Here is a link to start with. There are many references at the end which you can check out too: www.bmj.com/content/338/bmj.b2393#:~:text=Multiple%20imputation%20is%20a%20general,obtained%20from%20each%20of%20them.

@olegstupak7687 4 years ago

It would be interesting to see more theory behind the procedures. Also, the lecture is inconsistent with the literature in some places.

@omegastatistics 4 years ago

Hi Oleg. I know it would be nice, but I am an applied statistician by trade, so although I've been taught the theory, my expertise is in the work. In most of my presentations I provide references that are often more theory based. As for being inconsistent with the literature, things like restraining the limits etc. I wanted to show that it could be done. Many studies are not like the textbooks, not as pretty, and sometimes it is necessary to relax assumptions or allow for limitations. I hope you still enjoyed the presentation.

@oscarbecerril8343 4 years ago

Thank you for sharing, kind Lady.

@omegastatistics 4 years ago

Thanks for watching Oscar!

@bethanyhendriks4596 4 years ago

Can somebody please help? I have a very large data set, and when I try to run the imputation I get an error message in my output which says "contains more than 100 parameters, no missing values will be imputed". Please help.

@omegastatistics 4 years ago

Hi Bethany, Here is a link to IBM with info on what you can check. I would check the level of measurement for your variables in the Variable View of your dataset first, then go from there. www.ibm.com/support/pages/multiple-imputation-warning-model-contains-more-100-parameters

@qimeng9800 3 years ago

Thanks for this wonderful video! May I ask how I can get, or deal with, the pooled p, F, and SD values that are not shown when I do the t-test/regression analysis?

@omegastatistics 3 years ago

Hi Qi, I am not sure exactly what you are asking. I know SPSS doesn't give pooled stats for many things, so if you are asking how to pool the standard deviations etc., then you can start by checking out this link: stats.stackexchange.com/questions/460238/creating-a-pooled-data-set-from-multiple-imputation-output-in-spss

@kennethkoma7762 4 years ago

Hi, I'm trying to impute this data and it has a lot of NAs, and I can't remove the NAs because there are so many. When I run the mice code I get this error: Error in solve.default(xtx + diag(pen)) : system is computationally singular: reciprocal condition number = 3.35108e-20. Any idea how to handle this kind of problem?

@omegastatistics 3 years ago

Hi Collins, you will need to remove the NAs; otherwise the program will think you have string variables, and imputation works on numeric variables. If there is a way you can put your dataset into Excel, then you can search for the NAs by clicking on the arrow by "Find and Select", then "Replace". Under "Find what:" type NA (exactly as it appears in the data), and in "Replace with:" hit the space key. This will erase all of your NAs. Be sure to make a backup of the file before trying this in case things go awry. Hope this helps.

@kennethkoma7762 3 years ago

@omegastatistics thanks will do

@SereniTy_Corner 3 years ago

Please, what add-on do I need in SPSS to be able to perform multiple imputation? I can't seem to find it in 'Analyze'. Also, how do I add it to what I already have?

@omegastatistics 3 years ago

Please check with IBM regarding obtaining a license for the Missing Values add-on and adding it to what you have.

@gracexu602 4 years ago

Great presentation. I have a question. I am analysing longitudinal surveys (3 time points: T1, T2, T3) using a few scales. I need to compute one of the scales, as some questions are negative questions. Then I need to sum each scale at each time point before I run a post hoc analysis. Due to some missing data at one time point (either T2 or T3), I plan to run imputation. I want to know whether I should impute the original survey results only, or impute all the variables, including the computed data and the sum of each scale.

@omegastatistics 4 years ago

Hi Grace, Typically you will impute your data before you run your tests. But of course, make sure your data is nice and clean first, i.e., no strange numbers that shouldn't be there, like an age of 200 or things like that.
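The cleaning step mentioned in this reply can be sketched in Python as a simple range check run before imputation. This is only an illustration: the function name `find_out_of_range`, the column name, and the plausible limits are all hypothetical, and missing values are assumed to be stored as `None`.

```python
def find_out_of_range(rows, limits):
    """Return (row_index, column, value) for observed values outside their limits.

    Missing values (None) are skipped so they are left for the imputation step.
    """
    problems = []
    for i, row in enumerate(rows):
        for col, (low, high) in limits.items():
            value = row.get(col)
            if value is None:
                continue  # missing, not wrong: leave it to be imputed
            if not (low <= value <= high):
                problems.append((i, col, value))
    return problems

# Hypothetical survey rows: one implausible age and one missing age.
rows = [{"age": 34}, {"age": 200}, {"age": None}]
limits = {"age": (0, 120)}

problems = find_out_of_range(rows, limits)
print(problems)
```

Flagged cells can then be corrected or set to missing before the imputation is run, so that an impossible value does not feed into the imputation model.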

@gracexu602 4 years ago

@omegastatistics Hi Omega, thanks for the prompt reply.

@Silverwing_99 3 years ago

great tutorial - i abhor SPSS, wish the practical was in R

@omegastatistics 3 years ago

Yes, we all have our favorites :) I am glad you still found the information useful. Here is a nice tutorial in R: data.library.virginia.edu/getting-started-with-multiple-imputation-in-r/

@Mariajoseschulz 5 years ago

It's so clear.. Thank you very much.

@estat2127 3 years ago

helpful thanks!

@omegastatistics 3 years ago

Glad it was useful Black Hero!

@majid85 2 years ago

Can this method be used for precipitation data, when a whole year of data is missing? or in some years a few months in a row are missing?

@omegastatistics 2 years ago

It depends on how your data is structured. Also, remember that your data must be MCAR or MAR to use multiple imputation. If your data is MNAR then MI should not be used (but it often is). Try running the imputation and see what you get.

@snigdhodas2848 2 years ago

Great video...I wanted to ask is there any method by which we just impute only certain missing cells and keep the other missing cells unimputed or vacant as it is?

@omegastatistics 2 years ago

Hi Snigdho, You can easily leave entire variables out of the mix, but cell by cell, you may need to use some code with filters. Here is a link to some information on using filters: wlm.userweb.mwn.de/SPSS/wlmssel.htm

@snigdhodas2848 2 years ago

@omegastatistics thank you

@omegastatistics 2 years ago

You're welcome

@joat9105 3 years ago

So which data should we use to fill in the blanks, the first imputed dataset or one of the others? Thanks

@omegastatistics 3 years ago

Hi The Suck, all of the imputed datasets are used when you run the analysis. You use the multiple sets and that is why it is called multiple imputation. Choosing any of the analyses with the little sea shell looking icon next to them from the analysis menu does this automatically for you.

@sefinehfenta4248 3 years ago

Thank you

@omegastatistics 3 years ago

You're welcome!

@datascientist2958 3 years ago

How can we extract pooled imputed data set from SPSS?

@omegastatistics 3 years ago

Check into Rubin's Rules: www.ncbi.nlm.nih.gov/pmc/articles/PMC2727536/

@ayhanayhanayhan1 6 years ago

Perfect! Thanks a lot.

@bramantios5797 6 years ago

Nice, but I have a question. I ran 20 iterations and need a single pooled dataset before running the regression. How can I combine the 20 iteration datasets into a single pooled dataset? Thank you in advance.

@mohamedhabashy1262 6 years ago

It is easy if you are using R: there is a package called mice for multiple imputation, and it has a function called pool() for aggregating your results across any number of iterations.

@chrisjoyce92 6 years ago

I have similar problem. Did you get an answer on how to do this in SPSS?

@mohamedhabashy1262 6 years ago

You can aggregate your data: go to the Data menu and select Aggregate. The function is the mean, mode, or median, depending on your variable types.
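A rough Python sketch of this idea, choosing the aggregation function by variable type. The mapping of measurement level to function (mean for scale, median for ordinal, mode for nominal) is an assumption made for illustration, and `aggregate_cell` and the sample values are hypothetical; this is not the SPSS Aggregate dialog itself.

```python
from statistics import mean, median, mode

# Assumed mapping from measurement level to aggregation function.
AGGREGATORS = {"scale": mean, "ordinal": median, "nominal": mode}

def aggregate_cell(values, level):
    """Collapse one cell's values across imputations into a single value."""
    return AGGREGATORS[level](values)

# Hypothetical values for one case across 5 imputed datasets.
income = aggregate_cell([1.0, 2.0, 3.0, 4.0, 5.0], "scale")
rating = aggregate_cell([2, 3, 3, 4, 3], "ordinal")
region = aggregate_cell(["a", "b", "a", "a", "c"], "nominal")
print(income, rating, region)
```

Using the mode or median keeps categorical and ordinal cells on their original coding, whereas a mean could produce values such as 2.6 that do not correspond to any category.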

@bramantios5797 6 years ago

Mohamed Hussein, please let me know how to do that. I mean, can you give specific instructions so I can make a single pooled dataset? Moreover, is there any academic source about how to make a single pooled dataset? Thanks in advance.

@omegastatistics 5 years ago

Hi Bramantio, SPSS will do that for you if you choose regression and run as usual after imputing. Any processes that can make use of the imputed data will have a little sea shell looking icon next to the routine names in the Analyze menu items.

@guesswhatteapots 5 years ago

Thank you!

@abc10il 6 years ago

Great, Thank you

@monalisadas4186 6 years ago

Can we get the ppt plz..Thanks

@omegastatistics 6 years ago

Yes! Anyone who wants the handouts, please email your request to info@omegastatistics.com and let us know the particular presentation you need the slides for. Thanks!

@omegastatistics 5 years ago

Hello Monalisa, you can email me at info@omegastatistics.com and request a powerpoint for any of my presentations. If I still have it I will email it to you.

@sillyflowerdance 5 years ago

Thanks :)

@zoiyaehtisham818 3 years ago

Hello, this is very helpful information. I am now stuck; please can you guide me as to why my minimum and maximum values are not appearing in the constraints section when I type zero and 100?

@omegastatistics 3 years ago

Hi Zoya, It is hard for me to know what exactly you are asking and to give an answer without seeing your study/data. I do know that the minimum and maximum options for constraints are only available for when you choose "Linear Regression" as the scale variable model type in the "Method" tab. I hope you were able to figure this out.

@zoiyaehtisham818 3 years ago

@omegastatistics Thank you so much for your reply. I figured it out. My data is large, the variables are categorical and ordinal, and the data is missing not at random. I was running into the maxmodelparam limit, and after setting maxmodelparam to the desired number, an error occurred: "Warnings: The procedure cannot access a file with the given file specification: imputeddata for keyword IMPUTATIONS of subcommand OUTFILE. The file specification is either syntactically invalid, specifies an invalid drive or directory, specifies a protected directory, specifies a protected file, or specifies a non-sharable file. Execution of this command stops."