Welcome to Dr. Akhter's RU-vid Channel, your go-to destination for insightful explorations into the realms of data science, statistics, and numerical analysis. Led by a seasoned university professor
Embark on a journey with us as we delve deep into the world of data, uncovering its hidden patterns, extracting valuable insights, and harnessing its power to drive informed decision-making. Whether you're a novice seeking to grasp the fundamentals or a seasoned practitioner looking to expand your expertise, our videos offer something for everyone.
Through engaging lectures, tutorials, workshops, and real-world case studies, we bring complex concepts to life, making them accessible and understandable for learners of all levels. From statistical methods and machine learning algorithms to numerical techniques and computational tools, we cover a wide spectrum of topics designed to enhance your proficiency in the field.
Subscribe now and unlock the secrets of data-driven discovery!
I don't understand why the "Total cases" field is used as a sum rather than a count. As I understand it, it should be a count, since it is supposed to show the count of total cases.
During video editing, many important steps were left out that are crucial for understanding the whole process. Just some feedback; it would be great if this improves in upcoming videos. Thanks
See our work using the very strong soft computing approaches of FSVM and ANN. I appreciate it, as you are one of my good students in PhD (CS) and in Statistics as well.
Assalam-o-alaikum. 1) Sir, what is the use of set.seed at 13:59? 2) At 31:09 the table looks like a confusion matrix, but you said it is a contingency table, so what is the difference between these two tables? 3) In rpart at 35:47 the "." includes all independent variables, but how does it exclude the dependent variable?
Wa Salam. The seed command makes the random number generator start from the specified location, so the same sequence of random numbers is produced on every run. We should use it during the training and testing phase. After successful testing, at the time of deployment, we should remove this command from the code. A confusion matrix is a contingency table of two factors; the two have the same structure, but their purposes are different. A confusion matrix is used to check model accuracy, whereas a contingency table is used in the chi-square technique to check the independence of two factor variables.
@drakhterraza At 9:51 you said runif(nrow(iris)) is a rand()-like function to generate random numbers; that's why I am asking about the purpose of set.seed(9850).
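To illustrate the seed question in this thread, here is a minimal Python sketch. The video itself uses R, so Python's `random.Random(seed)` and `.random()` are only stand-ins for `set.seed` and `runif`, and the helper name `draw_uniform` is hypothetical. The point it tries to show is that the uniform generator still produces the random numbers; the seed only fixes where the sequence starts, so the same train/test split can be reproduced.

```python
import random

def draw_uniform(n, seed=None):
    """Return n uniform(0, 1) draws; a given seed always yields the same draws."""
    rng = random.Random(seed)  # private generator, global state is untouched
    return [rng.random() for _ in range(n)]

# Same seed -> identical sequence, so the same rows would land in the training set.
first = draw_uniform(5, seed=9850)
second = draw_uniform(5, seed=9850)

# A different seed starts the sequence somewhere else.
other = draw_uniform(5, seed=1)

print(first == second)  # True
print(first == other)
```

The seed value 9850 is taken from the comment above, but Python will not reproduce the numbers R generates for that seed; only the idea of repeatability carries over.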
Q1: If the variances of 2 populations are the same, which test is performed? Q2: Which test is used to determine whether the variances of 2 populations are equal? Q3: Keeping everything the same for example 10.3 in the video, test the hypothesis that private sector salaries are higher than public sector salaries. Give the answer as a conclusion.
1. The pooled t-test (T4) is used if the variances of the two populations are the same. 2. The F-test is used to determine whether the variances of two populations are equal. However, you can also use a boxplot and the condition s1/s2 < 2 to check whether the variances are equal.
3. The t-value is 2.395, which lies in the critical region, so the null hypothesis (that the two sectors have the same average salaries) is rejected. The data provide sufficient evidence that average salaries in the private sector are greater than average salaries in the public sector (critical region approach). According to the p-value approach, the p-value is approximately 0.01, which is less than alpha = 0.05. The conclusion therefore remains the same: the null hypothesis is rejected, and the data provide sufficient evidence that average salaries in the private sector are higher than average salaries in the public sector.
Q1: The pooled t-test is used if the variances of the populations are the same. Q2: We can use a boxplot. Moreover, the ratio s1/s2 can be used to determine whether the variances are the same: if the value of s1/s2 is less than or equal to 2, we say that the variances are the same. When using this formula, the greater s.d. must be kept in the numerator.
1) The pooled t-test (T4) is performed if the variances of the 2 populations are the same. 2) The F-test can be used to determine this; an informal test, s(large)/s(small) < 2, can also be performed. Variances can also be compared by sketching a boxplot.
Q1) Pooled t-Test (Procedure T4) is applied when variances of 2 populations are equal. Q2) F-Test is the formal procedure through which we test whether two population variances are equal. Other informal procedures include: 1) The ratio of two sample standard deviations and whether that ratio is less than 2 ( s(large)/s(small) < 2 - then we can get an idea that the two population variances are equal) 2) Constructing boxplots and comparing the width of the two boxplots. If width is approximately the same, then we can have an idea that the two population variances are equal. Q3) At 5% significance level, data provides sufficient evidence to conclude that salaries in the private sector are higher than salaries in public sector. The test results are statistically significant at 5% significance level.
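The answers in this thread can be sketched in Python as follows. This is a minimal sketch, not the video's own procedure: the function names `variances_look_equal` and `pooled_t` are hypothetical, and the summary statistics at the bottom are made-up numbers, NOT the data of example 10.3. It shows the informal s(large)/s(small) < 2 check and the standard pooled t statistic with n1 + n2 - 2 degrees of freedom.

```python
import math

def variances_look_equal(s1, s2):
    """Informal rule of thumb from the comments: larger s.d. / smaller s.d. < 2."""
    larger, smaller = max(s1, s2), min(s1, s2)
    return larger / smaller < 2

def pooled_t(mean1, s1, n1, mean2, s2, n2):
    """Pooled two-sample t statistic and its degrees of freedom."""
    df = n1 + n2 - 2
    # Pooled standard deviation: weighted average of the two sample variances.
    sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df)
    t = (mean1 - mean2) / (sp * math.sqrt(1 / n1 + 1 / n2))
    return t, df

# Hypothetical summary statistics (assumption for illustration only).
if variances_look_equal(8.0, 10.0):
    t, df = pooled_t(mean1=60.0, s1=8.0, n1=12, mean2=52.0, s2=10.0, n2=15)
    print(f"t = {t:.3f} on {df} degrees of freedom")
```

The resulting t would then be compared with the critical value for a one-tailed test at the chosen significance level, as in the critical region approach described above.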
Q1: Why is the difference of population means not shown in the formula for the z test? Q2: In the industrial engineer problem from the video, suppose the mean production of day workers is 345 units and that of night workers is 335 units. Keeping everything else the same, test the hypothesis that day production is greater than night production. Give the conclusion as the answer.
Q1: Under the null hypothesis both population means are equal, so their difference is equal to 0. Q2: At the 5% significance level, the data provide sufficient evidence to conclude that day production is greater than night production.
Q1) The two population means are assumed to be the same, which makes the difference of the two population means come out to zero. Q2) At the 5% significance level, the data provide sufficient evidence to conclude that the average number of units produced on the day shift is greater than that on the night shift. The test results are statistically significant at the 5% significance level.
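A small Python sketch may make the Q1 answer concrete: the hypothesised difference of the population means is part of the z statistic, but it drops out of the written formula because it is 0 under the null hypothesis. The function name `two_sample_z` is hypothetical. The means 345 and 335 come from the question above; the standard deviations and sample sizes are NOT given in this thread, so the values used here are assumptions for illustration only.

```python
import math
from statistics import NormalDist

def two_sample_z(mean1, mean2, sigma1, sigma2, n1, n2, hypothesised_diff=0.0):
    """z statistic for the difference of two means with known population s.d.s."""
    se = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    # hypothesised_diff is mu1 - mu2 under H0; it is 0 when H0 says the means are equal.
    return ((mean1 - mean2) - hypothesised_diff) / se

# Means from the question; sigma and n are assumed values, not the video's.
z = two_sample_z(345, 335, sigma1=21, sigma2=28, n1=40, n2=40)

# One-tailed p-value for H1: day production > night production.
p_value = 1 - NormalDist().cdf(z)
print(f"z = {z:.3f}, one-tailed p = {p_value:.4f}")
```

With other standard deviations or sample sizes the conclusion could differ, so the output here should not be read as the answer to the video's problem.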
Q1: If we are selecting samples without replacement, which formula is used? Q2: If there are two samples of sizes 2 and 3, how many possible pairs can be made from them? Q3: From question 2, what is the probability of selecting one particular pair? Q4: The mean of the sampling distribution of the difference of 2 means is equal to _____________? Q5: If the population standard deviations for two populations are 3 and 5, find the standard deviation of the difference of two means for samples of sizes 10 and 12 respectively.
Ans 1: NCn, where N is the population size and n is the sample size. Ans 2: 6. Ans 3: 1/6. Ans 4: The difference of the individual means of the two populations. Ans 5: 1.727.
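The numeric answers above can be checked with a short Python sketch. It assumes that a "pair" in Q2 means one value taken from each sample, and that Q5 uses the usual formula sqrt(sigma1^2/n1 + sigma2^2/n2); the labels in `sample_a` and `sample_b` and the values N = 5, n = 2 are hypothetical placeholders.

```python
import math
from itertools import product

# Ans 1: number of samples of size n drawn without replacement from N items.
# N = 5 and n = 2 are hypothetical values chosen only to show the call.
print(math.comb(5, 2))  # 10

# Ans 2 and 3: every pairing of one value from each sample (placeholder labels).
sample_a = ["a1", "a2"]
sample_b = ["b1", "b2", "b3"]
pairs = list(product(sample_a, sample_b))
print(len(pairs))       # 6
prob_one_pair = 1 / len(pairs)  # each pair equally likely: 1/6

# Ans 5: standard deviation of the difference of two sample means.
sd_diff = math.sqrt(3**2 / 10 + 5**2 / 12)
print(round(sd_diff, 3))  # 1.727
```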