Thank you for such an informative video. How can I implement k-means clustering in a wireless sensor network in order to plot parameters such as residual energy, live nodes, and packets transmitted (i.e. throughput)?
Hi, could you please share your insights on how to apply the clustering code when I have a year of data (time, vehicle 1 charging and vehicle 2 charging)?
Thank you so much for the video. Can you give me any tips? I need to cluster a huge smart meter data set, so I have a lot of columns (or variables). Please help.
Hi! Thanks for the very informative video. How would you go about clustering two variables of different types - one numerical, the other a string? In this case, SpendingScore & Gender.
@@KnowledgeAmplifier1 Thank you for the quick reply! I have one more question - what should I do if I have 3 different string variables? Is it OK to assign them to 3 numerical categories (0,1,2 or 1,2,3)? I read here (datascience.stackexchange.com/questions/22/k-means-clustering-for-mixed-numeric-and-categorical-data) that this is the wrong approach because of the different distances between the points: "Categorical data is a problem for most algorithms in machine learning. Suppose, for example, you have some categorical variable called "color" that could take on the values red, blue, or yellow. If we simply encode these numerically as 1,2, and 3 respectively, our algorithm will think that red (1) is actually closer to blue (2) than it is to yellow (3). We need to use a representation that lets the computer understand that these things are all actually equally different. One simple way is to use what's called a one-hot representation, and it's exactly what you thought you should do. Rather than having one variable like "color" that can take on three values, we separate it into three variables. These would be "color-red," "color-blue," and "color-yellow," which all can only take on the value 1 or 0." Sorry for the trouble and thank you very much for your insight!
@@IllSetYouFree Yes, we use one-hot encoding, but it helps to understand the difference between two scenarios for categorical variables. In the first, the categories are ordered and comparable, like First, Second, Third rank; here you can assign First=1, Second=2, Third=3. In the second, the categories have no order, like a country column with values India, Nepal, Bhutan, France, etc.; you cannot assign France=1, India=2, ... because they are not comparable, so in this case we go with one-hot encoding. I have already uploaded videos on how to handle these 2 scenarios: Dealing with categorical features in machine learning | MATLAB (One Hot Encoding): ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-wCebbrfInRI.html Categorical Data Handling | Part 2 | Machine Learning | MATLAB: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-_v4UT5qsibk.html I hope these videos help clear up any confusion about handling categorical data. Happy Coding :-)
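For anyone who wants to see the two scenarios from the reply above side by side, here is a minimal sketch in plain Python (the linked videos use MATLAB; this is only an illustration). The helper names `ordinal_encode` and `one_hot_encode` and the sample values are assumptions made up for this example, not code from the videos.

```python
# Scenario 1: ordered categories (ranks) can be mapped to integers.
# RANK_ORDER is a hypothetical mapping chosen for this illustration.
RANK_ORDER = {"First": 1, "Second": 2, "Third": 3}


def ordinal_encode(values, order):
    """Replace each ordered category with its integer rank."""
    return [order[v] for v in values]


# Scenario 2: unordered categories (countries) get one 0/1 column each.
def one_hot_encode(values):
    """Return (sorted category names, one 0/1 row per input value)."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    rows = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1  # exactly one column is set per row
        rows.append(row)
    return categories, rows


def squared_distance(a, b):
    """Squared Euclidean distance, the quantity k-means minimizes."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


ranks = ordinal_encode(["Second", "First", "Third"], RANK_ORDER)
categories, encoded = one_hot_encode(["India", "Nepal", "France", "India"])

print(ranks)
print(categories)
print(encoded)
```

With one-hot rows, any two different countries are the same squared distance apart (2), so no country is treated as "closer" to another, which is the point made in the quoted StackExchange answer.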
Whenever your features or parameters differ in their range of values (which is the case in most data sets), you have to normalize the data so that the differences in range do not affect the outcome of distance-based algorithms.
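As a rough illustration of that point, here is a small sketch of min-max scaling in plain Python, which is one common way to normalize (z-score standardization is another). The function names and the sample income/spending-score rows are hypothetical, invented for this example.

```python
def min_max_scale(column):
    """Rescale one column of numbers to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no information for clustering.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def scale_columns(rows):
    """Apply min-max scaling to every column of a list of rows."""
    columns = list(zip(*rows))                     # rows -> columns
    scaled = [min_max_scale(col) for col in columns]
    return [list(r) for r in zip(*scaled)]         # columns -> rows


# Hypothetical rows: [annual income, spending score]
data = [[15000, 39], [60000, 81], [120000, 6]]
scaled = scale_columns(data)

for row in scaled:
    print(row)
```

Without scaling, the income column (tens of thousands) would dominate the Euclidean distance and the spending score (0 to 100) would barely influence the clusters; after scaling, both columns lie in [0, 1] and contribute on a comparable footing.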