Data Analytic
Demystifying data science and visualisation with simple, practical and concise answers to your data science and visualisation questions.

We have hundreds of videos covering various aspects of data analytics. Our aim is to provide concise, practical and fit-for-purpose content for data scientists. You will regularly find new R, Python, C#, database and automation related content on this channel.

In today's world, using just one tool is not enough; a tradesman's toolkit holds various tools for various needs.
Enhance your toolkit by learning tools that are fit for purpose.

Browse through our full range of videos on visualisation using GGPLOT. When creating charts for a document, it is important to have the same look and feel for all your charts. We have a full suite of GGPLOT chart tutorials.

We thank everyone for helping to support this channel.

Please subscribe ru-vid.com?s_confirmation=1

R Hex bin map of Australia
7:48
21 days ago
Password Protect Excel File
1:22
1 month ago
Comments
@Damian12312 10 days ago
Thanks, but what if Monday is the first day of the week in my country and I'd like to show the data in that order?
@DataAnalytic 8 days ago
Hi. That can be achieved by creating a new column with one of the following DAX commands. Hope it helps.
For Monday as the start of the week: MondayStart = WEEKDAY('Table'[Admit Date], 2)
The return type can only be 1 (Sunday = 1 to Saturday = 7), 2 (Monday = 1 to Sunday = 7) or 3 (Monday = 0 to Sunday = 6), so any other start day needs a formula.
For Tuesday as the start of the week: TuesdayStart = MOD(WEEKDAY('Table'[Admit Date], 2) + 5, 7) + 1
For Saturday as the start of the week: SaturdayStart = IF(WEEKDAY('Table'[Admit Date], 1) = 7, 1, WEEKDAY('Table'[Admit Date], 1) + 1)
All the best.
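The renumbering in the reply above is modular arithmetic on the day of the week, so it generalises to any start day. A minimal sketch of that arithmetic in R rather than DAX, for illustration only; `weekday_from` and its `start` argument are hypothetical names, not part of Power BI or any package:

```r
# Number the days of the week 1..7, starting from a chosen day.
# `start` follows base R's convention: 0 = Sunday, 1 = Monday, ..., 6 = Saturday.
weekday_from <- function(date, start = 1) {
  wday <- as.POSIXlt(as.Date(date))$wday  # 0 = Sunday ... 6 = Saturday
  (wday - start) %% 7 + 1
}

dates <- as.Date("2024-01-01") + 0:6      # Monday 1 Jan 2024 to Sunday 7 Jan 2024
data.frame(
  date           = dates,
  monday_start   = weekday_from(dates, start = 1),
  saturday_start = weekday_from(dates, start = 6)
)
```

Sorting a chart by a column like this puts the chosen day first.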
@Damian12312 8 days ago
@DataAnalytic thanks a lot!!! I'm waiting for the next videos 😊
@DataAnalytic 7 days ago
Thank you for your kind words.
@Adeyeye_seyison 23 days ago
Great to see another tutorial from you, sir... It's been a while since you uploaded content. I trust you and your loved ones are doing great over there? Thanks a million for the previous content; it indeed gave me good direction in my journey of using R.
@DataAnalytic 22 days ago
So thankful for your kind words!
@enuarora 1 month ago
This is a video I was also looking for, but the font is extremely blurred. It would be very helpful if you could share the formula in a bigger and clearer font. Thanks in advance.
@DataAnalytic 1 month ago
Hello there, I will ensure that the DAX formulas are zoomed in. Here is the code:
AgeGroup = SWITCH(
    TRUE,
    ISBLANK(AGES[Customer Age]), "Unknown",
    AGES[Customer Age] <= 4, "0-4 yrs",
    AGES[Customer Age] <= 9, "5-9 yrs",
    AGES[Customer Age] <= 14, "10-14 yrs",
    AGES[Customer Age] <= 19, "15-19 yrs",
    AGES[Customer Age] <= 24, "20-24 yrs",
    AGES[Customer Age] <= 29, "25-29 yrs",
    AGES[Customer Age] <= 34, "30-34 yrs",
    AGES[Customer Age] <= 39, "35-39 yrs",
    AGES[Customer Age] <= 44, "40-44 yrs",
    AGES[Customer Age] <= 49, "45-49 yrs",
    AGES[Customer Age] <= 54, "50-54 yrs",
    AGES[Customer Age] <= 59, "55-59 yrs",
    AGES[Customer Age] <= 64, "60-64 yrs",
    AGES[Customer Age] <= 69, "65-69 yrs",
    AGES[Customer Age] <= 74, "70-74 yrs",
    AGES[Customer Age] <= 79, "75-79 yrs",
    AGES[Customer Age] <= 84, "80-84 yrs",
    AGES[Customer Age] > 84, "85+ yrs"
)
@enuarora 1 month ago
Thanks a lot! Appreciated
@ldsharma6546 1 month ago
Sir, do you conduct classes on RStudio?
@DataAnalytic 1 month ago
Hi, I'm happy to create more informative videos on any topics you would like to suggest.
@ldsharma6546 1 month ago
Sir, it is a very informative and wonderful video. Thank you very much for considering my request. I hope your good self will create more informative videos in the future 🙏
@DataAnalytic 1 month ago
Thanks, and you're welcome.
@shamusenright5387 1 month ago
Great video. What was the procedure for getting the entire list of base R colours to change colour?
@DataAnalytic 1 month ago
Hi, to see the entire list of colors, simply type the command colors()
@shamusenright5387 1 month ago
@DataAnalytic Thanks. I meant getting the list of colour names from the colours() function output to then change colour?
@DataAnalytic 1 month ago
Hi. I'm not sure I got the question right, but I will try to give you a generic answer and hope that it covers what you wanted to ask...
1. Type the colours() command and it lists all the colour names in the terminal window.
2. Copy these values from the terminal window and paste them into your script; there you can also see the actual colours, which makes it easy to choose the one you want.
3. To use the colours you have two choices. If you want all the points to be the same colour, set the colour outside the aesthetics. If you want the colour to differ based on some grouping, set it within the aesthetics, e.g. aes(colour = gender); this way each gender gets a different colour. You can then add another line to specify the colours, e.g. scale_colour_manual(values = c('red', 'blue')).
Hope this is what you initially wanted to ask. Let me know if I haven't been able to answer your question. All the best.
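The first step above can also be done in code instead of copying names from the console. A minimal base R sketch, assuming you want to shortlist colour names by a keyword and preview them; the keyword "blue" and the variable names are only illustrative:

```r
# All built-in colour names known to base R
all_cols <- colours()

# Shortlist the names that contain a keyword
blues <- grep("blue", all_cols, value = TRUE)
head(blues)

# Preview the first ten shortlisted colours as labelled bars
shortlist <- head(blues, 10)
barplot(rep(1, length(shortlist)), col = shortlist,
        names.arg = shortlist, las = 2, axes = FALSE)
```

Any of the shortlisted names can then be passed to scale_colour_manual(values = ...) as described in the reply.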
@shamusenright5387 1 month ago
@DataAnalytic Thank you very much for the detailed reply. That's exactly what I was after (and more). Cheers
@DEXTER-bn9zu 1 month ago
Can we do the same in Python?
@DataAnalytic 1 month ago
Hi, I haven't tried it myself, but try this out: pypi.org/project/fuzzywuzzy/
@maheshkshirsagar6361 1 month ago
Please share the dataset file in the description.
@DataAnalytic 1 month ago
Hi. The dataset is generated by code, and the code is available at rpubs.com/techanswers88/913784
Hope it helps. All the best.
@nathasyapramudita6312 1 month ago
Does it still work the same if I simplify the code as:
```
data |>
  ggplot(aes(x = date)) +
  geom_line(aes(y = patients)) +
  geom_point(aes(y = patients)) +
  geom_line(aes(y = death)) +
  geom_point(aes(y = death)) +
  theme_classic()
```
Without the clutter of p1 on every new line 😊
@DataAnalytic 1 month ago
Hi. Yes, it will work perfectly; this syntax is short and a lot of people prefer this style.
Pros: easier to write, quick, less typing.
Cons: complex code becomes hard to read. I use the pl <- pl + theme_classic() style of syntax; if I have 30 charts in a report, I can easily search for a line and comment it out or modify it.
Hope it helps. All the best.
@matheusdeluna1409 2 months ago
Very useful video!
@DataAnalytic 2 months ago
Glad it was helpful!
@univ41soukahras48 2 months ago
Hi, I have a problem when I run the following code to draw a boxplot:
boxplot(data_ind$Na~data_ind$Stations, range = 1.5, width = NULL, varwidth = FALSE, notch = FALSE, col = c("blue","red","orange","gray"), xlab = "", ylab = "Na %",)
The following message appears: Error in plot.new() : figure margins too large
and the drawing does not appear. The variable Na % ranges from 17.39 to 43.02.
@DataAnalytic 2 months ago
Hello. You are not using the GGPLOT boxplot; you are using base R to plot your chart. You can try the dev.off() command to see if it works for you.
Check the margins using this command: par("mar")
Set the margins with the following command: par(mar=c(1,1,1,1))
Here is an example:
library(tibble)
data <- tribble(
  ~Gender,  ~Age,
  'Male',   80,
  'Male',   40,
  'Male',   60,
  'Male',   70,
  'Female', 80,
  'Female', 30,
  'Female', 40,
  'Female', 50,
  'Female', 60,
  'Female', 70,
  'Female', 150
)
dev.off()
par(mar=c(1,1,1,1))
boxplot(formula = Age ~ Gender, data = data, col = c("red","blue"), xlab = "Patient Gender", ylab = "Patient Age")
Hope it works for you.
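Two details of the steps above are worth guarding against: dev.off() raises an error when no graphics device is open, and margins of one line leave little room for axis labels. A hedged sketch under those assumptions; `data_ind` here is made-up data standing in for the commenter's, with only the range 17.39 to 43.02 taken from the question:

```r
# Made-up data standing in for the commenter's data_ind
set.seed(1)
data_ind <- data.frame(
  Stations = rep(c("S1", "S2", "S3", "S4"), each = 10),
  Na       = runif(40, min = 17.39, max = 43.02)
)

# Close the current device only if one is actually open
# (device 1 is the null device and cannot be closed)
if (dev.cur() > 1) dev.off()

# Inspect the margins, then set smaller ones that still leave room for labels
par("mar")
old_par <- par(mar = c(3, 4, 1, 1))

bp <- boxplot(Na ~ Stations, data = data_ind,
              col = c("blue", "red", "orange", "gray"),
              xlab = "", ylab = "Na %")

par(old_par)  # restore the previous margins
```

In RStudio the same error often just means the Plots pane is too small, so enlarging the pane is worth trying first.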
@univ41soukahras48 2 months ago
@DataAnalytic thanks
@stephenkojwang1151 2 months ago
One of the best tutorials ever. Thanks a bunch!
@DataAnalytic 2 months ago
Glad you think so!
@ldsharma6546 2 months ago
Good morning Sir, I have analysed 750 soil samples for soil acidity in the state of Manipur, India. How can I develop the same maps in ggplot for my data? Kindly teach me.
@DataAnalytic 2 months ago
Hello, yes. I assume that you have taken soil samples and tested them for soil acidity across Manipur. 750 locations is a good number; I will try to do a mockup.
@ldsharma6546 2 months ago
@DataAnalytic Sir, could you share your email ID with me?
@DataAnalytic 1 month ago
Hi, I have prepared a demo for you to plot a Manipur chart and I will soon upload it. ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-jvpRKndh5RA.html
@ldsharma6546 1 month ago
Sir, thank you. I'm eagerly waiting.
@DataAnalytic 1 month ago
Hi, thanks for your patience. I have created a video for a Manipur district map with data points.
@nuestraaula1991 3 months ago
Excellent video, thank you so much!
@DataAnalytic 3 months ago
You're very welcome!
@nuestraaula1991 3 months ago
@DataAnalytic from Barranquilla, Colombia 🙏❤️🇨🇴
@Bungizy 3 months ago
Thank you. I have two datasets: case (d1) and control (d2). Both d1 (n=226) and d2 (n=219) have one unique person per observation. I want to fuzzy_left_join d2 to d1, matching on gender (M/F) and age_in_months (lowage and highage based on +/- 6 months). Your tutorial worked, but my joined dataset had n=3067 observations because multiple d2 records meet the match criteria of gender and age range for every d1 observation. My problem: in the djoined of n=3067, if I remove duplicated d1 records and downsize to n=226, some unique records from d2 will be dropped and not represented. My request: how do I retain only the (n=226) d1 observations while maximizing the number of d2 records matched without duplication? Thank you.
@DataAnalytic 3 months ago
Hi, thanks for explaining your question very clearly. You have a control dataset and a case dataset, and one person will only appear in one of them, so you do not have a unique person ID to join the data on. In your case you are joining on gender and age range; is it possible to pick some other key which can help you? At the moment the same record from d1 is joined to multiple rows in d2. Another way is to take only the first value in each group: group your joined dataset by gender and age group and pick the top one using slice(1). Without looking at the data, I can only give generic advice which may not fit your exact circumstances. If you are able to make some dummy data, put the code here and I can give it a further try.
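The slice(1) suggestion keeps one row per group but can still reuse or drop controls. One alternative is a greedy one-to-one match that never reuses a control. This is a sketch of a swapped-in technique, not the method from the video, and a greedy pass is not guaranteed to find the maximum number of matches. The column names `gender` and `age_in_months` follow the question; the function `match_controls` and the sample rows are hypothetical:

```r
# Made-up cases (d1) and controls (d2); column names follow the question
d1 <- data.frame(id = 1:3,
                 gender = c("M", "M", "F"),
                 age_in_months = c(24, 26, 30))
d2 <- data.frame(id = 101:103,
                 gender = c("M", "M", "F"),
                 age_in_months = c(25, 30, 40))

# Greedy one-to-one matching: each case takes the closest-aged unused
# control of the same gender within `tol` months, or NA if none is left
match_controls <- function(cases, controls, tol = 6) {
  used <- rep(FALSE, nrow(controls))
  control_row <- rep(NA_integer_, nrow(cases))
  for (i in seq_len(nrow(cases))) {
    age_gap <- abs(controls$age_in_months - cases$age_in_months[i])
    eligible <- which(!used & controls$gender == cases$gender[i] & age_gap <= tol)
    if (length(eligible) > 0) {
      best <- eligible[which.min(age_gap[eligible])]
      control_row[i] <- best
      used[best] <- TRUE
    }
  }
  cbind(cases, control_id = controls$id[control_row])
}

matched <- match_controls(d1, d2)
matched
```

The result has exactly one row per case. Cases left with NA found no unused control within the tolerance, and the order in which cases are processed can change which controls get used.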
@boipelokedikilwe1302 3 months ago
Thank you for this. But how can you interact with the chart that you created in R? That is, clicking the R chart rather than the Power BI chart?
@DataAnalytic 3 months ago
Hi, the R charts reflect the changes in other Power BI components. The video shows that when we select a particular product in a Power BI chart, the R chart also changes to reflect the selected product. But there are no controls within the R chart to interact with it directly.
@nuestraaula1991 3 months ago
No music, please
@DataAnalytic 3 months ago
Hi, yes, I agree. It is one of the older videos; we do not use background music at all now!
@raokramer 3 months ago
Thank you !!
@DataAnalytic 3 months ago
Thanks
@MizanurShuvraRidwan 3 months ago
I tried running the gtsummary package and it shows errors because my variables are factors. So I guess you need to mention what types of variables work for the gtsummary cross tabs.
@hugohenriquez9992 3 months ago
Nice video! I've got a question: how can I use data with NA values?
@DataAnalytic 3 months ago
Try filling the NAs with a string like 'Not Available' or 'Blank', but be aware that the NAs might be at each node level, so do this right at the start.
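A minimal base R sketch of that fill step, assuming the node levels are stored as character columns; the data frame `nodes` and its column names are made up for illustration:

```r
# Made-up hierarchy with gaps at different node levels
nodes <- data.frame(
  level1 = c("A", "A", NA, "B"),
  level2 = c("x", NA, "y", NA),
  stringsAsFactors = FALSE
)

# Replace every NA in the character columns before building the chart
nodes[is.na(nodes)] <- "Not Available"
nodes
```

Factor columns would need converting with as.character() first, because the new label is not an existing factor level.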
@zoef2170 3 months ago
How do you fix the distortion of the hexagons (i.e., AK and HI)? They are not all uniform at the end.
@DataAnalytic 3 months ago
Hi, in this example the hexagons are defined by the shape file, hence they can't be changed to make them uniform. There are other methods which can achieve a uniform size for the hexagons, because the hexagons are not in the shape file but generated programmatically. But that second method does not ensure that you get one hexagon for each state.
@KevinVelan 4 months ago
Thanks, brother, exactly what I've been searching for
@DataAnalytic 4 months ago
Glad to hear it
@joseoscardelgadobautista2105 4 months ago
Hi!!!! Excellent!!! But how do I go back to working with R?
@DataAnalytic 4 months ago
Open an existing R script or open a new R Script page and you can start working in R.
@John_F898 4 months ago
Don’t the slashes in paths need to be forward slashes in R? Do you set slash direction in preferences? Also, why are there double slashes in this example?
@DataAnalytic 4 months ago
Backslashes have to be doubled (\\) because a single backslash is the escape character in R strings; forward slashes (/) are written singly. You can use whichever you like.
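A short sketch of the equivalent ways to write a Windows path in R; the path C:\data\sales.csv is hypothetical and no file is read:

```r
# Three ways to write the same hypothetical Windows path in R
p_double  <- "C:\\data\\sales.csv"    # backslash escaped by doubling
p_forward <- "C:/data/sales.csv"      # forward slashes need no escaping
p_raw     <- r"(C:\data\sales.csv)"   # raw string, R 4.0.0 or later

identical(p_double, p_raw)            # both hold single backslashes
cat(p_double, "\n")                   # cat() shows the string as stored

# Convert backslashes to forward slashes when needed
gsub("\\", "/", p_double, fixed = TRUE)

# file.path() joins the parts with forward slashes
file.path("C:", "data", "sales.csv")
```

The doubled backslash exists only in the source code; the stored string contains a single backslash.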
@jordyvanlooy4715 5 months ago
🙂🙂🙂
@justiflower3993 5 months ago
Informative.
@TegeElleMusic 5 months ago
Awesome video!
@Ashis_Udgata 5 months ago
Nice video. I am unable to open the DIVA-GIS website to download the shape file. Can you help me out?
@DataAnalytic 5 months ago
Hi, if the site is not working then try one of these:
hub.arcgis.com/content/cba8bddfa0ab43ddb35a7313376f9438/about
www.indiaremotesensing.com/2017/01/download-india-shapefile-with-official.html
www.igismap.com/download-india-administrative-boundary-shapefiles-states-districts-sub-districts-pincodes-constituencies/
www.researchgate.net/post/Suggestions_how_to_get_Shapefile_of_India_with_all_700_Indian_districts
@Ashis_Udgata 5 months ago
@DataAnalytic Thank you for the quick reply. Can I get your email ID? I am getting an error while implementing the code.
@DataAnalytic 5 months ago
Please feel free to put it in a comment so that I can reply.
@Ashis_Udgata 5 months ago
@DataAnalytic Is it possible to add different numerical values to the selected districts in the map, e.g. population size, number of govt. schools, etc.?
@nagehansahin9335 5 months ago
Hi, is there any way to create a three-dimensional bar chart?
@DataAnalytic 5 months ago
Hi, yes, there are different libraries to do that. Here is a very simple example:
install.packages("plot3D")
library(plot3D)
scatter3D(iris$Sepal.Length, iris$Petal.Length, iris$Sepal.Width)
Also explore echarts4r.
@nagehansahin9335 5 months ago
@DataAnalytic But scatter3D is not a bar chart?
@DataAnalytic 5 months ago
Hi, sorry, I was just alluding to the plot3D library for you to have a look at.
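For bars specifically, plot3D also provides hist3D(), which draws a matrix of values as 3D columns. A minimal sketch using the built-in VADeaths matrix; the styling arguments are just one choice, and the plot is skipped if plot3D is not installed:

```r
# VADeaths is a built-in 5 x 4 matrix of death rates (age group x population)
z <- VADeaths

if (requireNamespace("plot3D", quietly = TRUE)) {
  plot3D::hist3D(
    x = seq_len(nrow(z)), y = seq_len(ncol(z)), z = z,
    space = 0.3,              # gap between neighbouring bars
    border = "black",
    ticktype = "detailed",
    xlab = "Age group", ylab = "Population", zlab = "Death rate"
  )
}
```

Each cell of the matrix becomes one bar, so any numeric matrix of counts or rates can be plotted the same way.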
@Nyashaelon 6 months ago
nice 🙏
@DataAnalytic 5 months ago
Thanks 🙏
@constanzamartinez1780 6 months ago
Hi, thanks for your video. I'm having problems with geom_sankey; maybe you can help me:
Error in `geom_sankey()`:
! Problem while computing stat.
ℹ Error occurred in the 1st layer.
Caused by error in `map()`:
ℹ In index: 1.
Caused by error in `dplyr::mutate()`:
ℹ In argument: `dplyr::across(c(x, next_x), ~as.numeric(.), .names = ("n_{.col}"))`.
Caused by error in `across()`:
! Can't select columns that don't exist.
✖ Column `next_x` doesn't exist.
I checked and I do have the column next_X. I appreciate your help.
@DataAnalytic 6 months ago
Looks like something is wrong with the data. There is a link in the description of the video which has the code with some sample data. If you have not seen it yet, then please try to run the code and see if you are able to get a plot. If you still have an issue, then paste some dummy data in the comment so that I can try it out.
@JetLagRecords 6 months ago
Data Analytic, wow, this made my day brighter! Thank you!
@trid3nt749 6 months ago
Thanks a lot, for some reason I couldn't find this anywhere on Stack Overflow
@DataAnalytic 6 months ago
You're welcome
@mirlot1298 7 months ago
Congratulations, and thank you for such valuable information
@michaelmahoney3806 7 months ago
Great job walking me through the construction of this plot. I don't utilize this plot design a great deal so I appreciate your tips and tricks. Thanks!
@DataAnalytic 7 months ago
Thank you so much!
@Jazzmaster11 7 months ago
It's very useful, thank you so much!
@DataAnalytic 7 months ago
Glad it was helpful!
@MariánFrais 7 months ago
My confidence interval does not look like the normal distribution as yours does. What might be the problem? Each of my cases has a different SE.
@DataAnalytic 7 months ago
Hi, I think you may have very extreme values in one or more observations.
@CanDoSo_org 7 months ago
Hi, thanks. But why don't you use:
cat(glue("There are **{nrow(diamonds)}** diamonds in the dataset."))
instead of:
glue("There are **{nrow(diamonds)}** diamonds in the dataset.")
Same output, but the latter one is simpler.
@DataAnalytic 7 months ago
Correct. By default, if you give the name of a variable, it gets printed, so all three will give you the same result:
cat(glue('There were {nrow(diamonds)} diamonds in the dataset. '))
glue('There were {nrow(diamonds)} diamonds in the dataset. ')
print(glue('There were {nrow(diamonds)} diamonds in the dataset. '))
@nadiadansani2139 8 months ago
I'm getting a Python error with install_python(version = version). It says I need to download git and ensure it is on the path.
@DataAnalytic 8 months ago
Hi, try installing Python standalone and see if it gets installed properly.
@NamgayNamgay-t1g 8 months ago
Very useful. Thank you.
@undzmoi106 8 months ago
Hi, thanks for explaining. Does it work with ggplotly? I added logo images to a geom_sf map with the geom_image function, but this doesn't work when I add ggplotly. How can I do this? 🥺
@morninglorya918 8 months ago
Thank you so much for this video! I am wondering if the highcharter package allows you to skip nodes for some observations. For example, I have three nodes. Some observations have values at all three nodes and this is straightforward (n1 > n2 > n3). However, some of my observations do not have an observation at n2. I would like for these observations to be depicted in the diagram as going directly from n1 > n3. Is this possible?
@DataAnalytic 7 months ago
Hi there. You can also have a look at the ggplot sankey plots, but they will still create an empty node at the n2 level instead of going directly from n1 to n3. I guess you have some control if you build your data so that n2 and n3 are put at the same level; it is a matter of experimenting.
@Mohammed-yl5wr 8 months ago
Thank you so much for sharing the informative video. I want to install ("import") SGDClassifier from sklearn.linear_model as shown below: from sklearn.linear_model import SGDClassifier
@Adeyeye_seyison 8 months ago
Sir, I have a video request. Your dplyr playlist was my go-to treasure trove that helped my learning. I came across the fill() and tidy() functions in the tidyverse but don't know what they do. Would you, sir, help with a tutorial on these functions?
@DataAnalytic 8 months ago
Noted, will do.
@Adeyeye_seyison 8 months ago
@DataAnalytic Thanks a million sir!
@Adeyeye_seyison 8 months ago
Thanks a million, sir, for all you do and represent, and for your value-adding content and tutorials.
@danielmoralesnavarro 8 months ago
Thank you for the video!
@sutchak 8 months ago
Very useful video. However, my question is whether there is any real need to use sjPlot instead of ggplot2.
@DataAnalytic 8 months ago
In some cases sjPlot gives lots of features with very little code, e.g. showing the percentage and the frequency on a bar chart. Also, the histograms in sjPlot are really cool. But of course everything is achievable using ggplot2. I personally like the sjPlot features for plotting regression models, using plot_model: www.rdocumentation.org/packages/sjPlot/versions/2.8.15/topics/plot_model
@Salty0 9 months ago
From a technical perspective it is alright, but a line chart is not applicable here. Each of the cars is a distinct item and should not be placed on a continuous axis.
@DataAnalytic 9 months ago
Yes, you are right. These are discrete values joined together with lines, but this chart made it easier to demonstrate the annotation techniques; with continuous data the chart was getting really busy.
@Svykle 9 months ago
How do I display each data point on the x axis?
@DataAnalytic 9 months ago
Hi. In the line chart (assuming that it has dates on the X axis), if you have only a few points, you can display them all nicely; if you have more points, try rotating the X axis labels. And if you still have too many points, use the dual axis as shown in the video. One of these strategies should work for you. I hope I understood your point correctly; otherwise feel free to provide a bit more information on what you want to do.
@AkshayGaneshkumarNSx 9 months ago
Excellent demonstration
@DataAnalytic 9 months ago
Thank you!