How To Fix Missing Data In Matlab [Machine Learning] 

CodingLikeMad

In this MATLAB tutorial I go over basic techniques for filtering out and filling in missing data. This machine learning video covers how to filter out missing data, techniques for imputing missing data such as linear interpolation and nearest-neighbor interpolation, and some of the reasons why you sometimes can't use them. New videos come out every week, so check back often!
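As a rough illustration of the techniques named in the description, here is a minimal sketch (not the code from the video). The sample vectors `t` and `y` are made up for the example, and `fillmissing` is assumed to be available, which requires MATLAB R2016b or newer.

```matlab
% Hypothetical sample data: a signal with gaps marked as NaN.
t = (1:8)';
y = [1; 2; NaN; NaN; 5; 6; 7; 8];

% Option 1: filter out the rows that contain missing values.
keep   = ~isnan(y);
tClean = t(keep);
yClean = y(keep);

% Option 2: fill the gaps by linear interpolation between known neighbours.
yLinear = fillmissing(y, 'linear');

% Option 3: fill each gap with the nearest non-missing value.
yNearest = fillmissing(y, 'nearest');
```

Filtering is the safest choice when the gaps are few and unrelated to the values themselves; interpolation only makes sense when neighbouring samples are actually related, as in a time series.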

Science

Published: 29 Jun 2024
Comments: 4
@spoonfairy1723 4 years ago
I have a question about how to represent that an event does not happen. For example, I have water samples with lots of variables: pressure, contamination, and temperature, which changes over time from -10C to 110C. What I want to know is when each variation of the water freezes and boils. But sometimes the water does not freeze at all in that range, because maybe it would only freeze at -20C. How do I best represent this in MATLAB? I can't put a 0, because that would be read as freezing at 0C, but if I leave it blank it becomes NaN and the whole row gets thrown out :E
@CodingLikeMad 4 years ago
Hi Oscar, thanks for your really interesting question! For sure, filling in 0s is a bad idea, like you mentioned. Unfortunately, I can't give you a "this is the official way" answer, because the truth is that it is case by case.

What makes YOUR case interesting is that there is an implied bias in the result. Often, missing data means the result was simply not available, i.e., they didn't make the measurement, or the machine was faulty, etc. Here, if freezing doesn't occur, it usually means the result is an extreme one, so putting 0 in is indeed very bad. Unfortunately, I can't fully answer your question without knowing what model you feed the data into next.

But let's say every case is like the one you described. A first step is to assign not 0 but, say, -10 to every such case. In fact, you probably know that it will not be any lower than about -18 unless the water is contaminated with some weird antifreeze, so this is already quite a bit better, though of course it is still not good enough for a scientific study.

One option is to build a model on the data you do have that predicts the freezing temperature, and then use it to fill in the gaps in the other data points. I.e., learn how to predict the freezing point for samples that freeze between 0 and -10, and then apply that to samples that would freeze around -15. This assumes that the data you DO have contains some information about the missing freezing temperature. This is probably the BEST advice. I give two more solutions below in case you are already doing machine learning work with the parameters.

If you are using a non-linear multivariate approximation model (such as a neural network or a random forest) and you have enough data, another option is to provide a "sentinel value" such as -20, which only occurs when the data is missing. With such an extreme value, the model can choose to do something different with this result, and by giving it the right sign you can have it behave at least roughly correctly during training and application (i.e., you get to reuse the parameters inside the neural network, with no need to build two networks internally).

Another approach that is occasionally used is to add an extra variable alongside the sentinel-filled variable: a true/false boolean that indicates whether the value is missing. This helps the model switch its behavior, and the sentinel value can then be anything, but should probably be 0.

Finally, I remember learning in a course about a class of models that can be fit with data giving bounds rather than values for some of the data points. I can't recall the name of this class of models, and to my knowledge they aren't popular, but there may be some research in that direction. Sorry there is no "best practice" available here, but it really depends on what "missing" means!
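The two main suggestions in this reply can be sketched roughly as follows. Everything here is an assumption made for illustration: the `contamination` and `freezeTemp` values are invented, the "model" is a plain least-squares line, and -20 is just the sentinel mentioned in the reply.

```matlab
% Hypothetical data: freezing point in deg C, NaN where the sample
% never froze within the tested range of -10 C to 110 C.
contamination = [0.0; 0.5; 1.0; 1.5; 2.0; 2.5];   % made-up predictor
freezeTemp    = [0.0; -2.0; -4.0; -6.0; NaN; NaN];

isMissing = isnan(freezeTemp);

% Approach A: fit a simple linear model on the rows that were observed,
% then predict the freezing point for the rows that were not.
X    = [ones(size(contamination)), contamination];
beta = X(~isMissing, :) \ freezeTemp(~isMissing);  % least-squares fit
freezeModel = freezeTemp;
freezeModel(isMissing) = X(isMissing, :) * beta;

% Approach B: sentinel value plus a boolean "missing" indicator column,
% intended as input features for a non-linear model.
sentinel = -20;                 % assumed value that never occurs in real data
freezeSentinel = freezeTemp;
freezeSentinel(isMissing) = sentinel;
features = [contamination, freezeSentinel, double(isMissing)];
```

Approach A extrapolates beyond the observed range, so it is only as good as the assumption that the trend continues. In Approach B, once the indicator column is present, the sentinel could also be set to 0 as the reply suggests.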
@TechLearnWithShaghayegh 2 years ago
fillmissing doesn't work in MATLAB 2014. What is the alternative to fillmissing?
@CodingLikeMad 2 years ago
That's a pretty old version. For that case, my suggestion would be to do it manually, unfortunately. Convert your table to an array (I have a video on that if you have trouble), and then use logical indexing to fill the values, something like arr(isnan(arr)) = fillValue
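A minimal sketch of that manual approach, using only functions that predate `fillmissing` (`table2array`, `isnan`, `interp1`). The table `T`, its column names, and the fill value are hypothetical.

```matlab
% Hypothetical table with missing entries stored as NaN.
T = table([1; NaN; 3; 4], [10; 20; NaN; 40], 'VariableNames', {'a', 'b'});

% Convert the table to a plain numeric array.
arr = table2array(T);

% Fill every NaN with a constant using logical indexing.
fillValue = 0;                  % assumed placeholder value
filled = arr;
filled(isnan(filled)) = fillValue;

% Alternative: linearly interpolate one column over its known points.
% Gaps at the very start or end stay NaN, since interp1 does not
% extrapolate by default with the 'linear' method.
col  = arr(:, 1);
idx  = (1:numel(col))';
good = ~isnan(col);
col(~good) = interp1(idx(good), col(good), idx(~good), 'linear');
```

The constant fill is the direct equivalent of the one-liner in the reply; the `interp1` variant is one way to approximate what `fillmissing(..., 'linear')` does in newer releases.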