
DSWS Munging 

Dr Akhter Raza

Data Cleaning
In today's data-driven world, the ability to understand and leverage data is crucial. Data Science combines statistics, computer science, and domain expertise to extract valuable insights from data, helping businesses make informed decisions, optimize operations, and drive innovation. This free online workshop series is designed to guide you through the fundamental concepts and techniques of Data Science, equipping you with the skills to harness the power of data. The series consists of eight lectures, one per month.
Registration Link: bit.ly/3KoaFRy
Join WhatsApp Group:
My YouTube Channel: / @drakhterraza
Web Site: studyspace.pk/
Email: dr.raza.akhter@gmail.com, akhter@fuuast.edu.pk
Download Tools: R and RStudio
Workshop Topics:
1. Data Collection and Munging
Date: Friday June 28, 2024
Time: 8 to 10 PM
Data is the foundation of any data science project, but raw data often comes with imperfections. In this session, you'll learn how to gather data from various sources and clean it to ensure its quality and reliability. We'll cover methods for data collection; techniques for handling missing values and outliers; removing inconsistencies and duplicate records; splitting data into training and testing sets; and best practices for preprocessing data to prepare it for analysis.
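The cleaning steps listed for this session can be sketched in a few lines. The workshop itself uses R; the example below is only an illustrative Python sketch using the standard library, and the records, function names, the median-imputation choice, and the 1.5 x IQR outlier rule are all assumptions made for the example, not the workshop's own material.

```python
import random
import statistics

# Hypothetical (id, age) records; None marks a missing age.
raw = [(1, 23.0), (2, None), (3, 31.0), (3, 31.0),
       (4, 27.0), (5, 250.0), (6, 29.0), (7, 25.0)]


def drop_duplicates(records):
    """Keep the first occurrence of each record, preserving order."""
    seen, unique = set(), []
    for rec in records:
        if rec not in seen:
            seen.add(rec)
            unique.append(rec)
    return unique


def impute_median(records):
    """Replace missing ages with the median of the observed ages."""
    observed = [age for _, age in records if age is not None]
    fill = statistics.median(observed)
    return [(rid, fill if age is None else age) for rid, age in records]


def iqr_bounds(values, k=1.5):
    """Return the (low, high) fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def train_test_split(records, test_fraction=0.25, seed=42):
    """Shuffle a copy of the records and split off a test portion."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


deduped = drop_duplicates(raw)
imputed = impute_median(deduped)
low, high = iqr_bounds([age for _, age in imputed])
cleaned = [(rid, age) for rid, age in imputed if low <= age <= high]
train, test = train_test_split(cleaned)
print(len(raw), len(deduped), len(cleaned), len(train), len(test))
```

The median is used for imputation here because, unlike the mean, it is not pulled toward the extreme value of 250 that the outlier step later removes.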
2. Data Visualization
Date: Saturday July 27, 2024
Time: 8 to 10 PM
Once data is collected and cleaned, the next step is to visualize it in a way that reveals patterns and insights. This session will introduce you to the principles of data visualization and the tools you can use to create compelling charts and graphs. You'll learn how to effectively communicate your findings through visual representations and how to choose the right type of visualization for your data. We'll also discuss the main advantages of data visualization.
3. Describing the Data
Date: Saturday August 31, 2024
Time: 8 to 10 PM
Understanding your data through descriptive statistics is crucial for initial data exploration. In this session, we'll delve into measures of central tendency (such as mean, median, and mode) and measures of dispersion (like range, variance, and standard deviation). These concepts will help you summarize your data and understand its distribution and variability.
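As a small illustration of these measures, the sketch below computes them with Python's standard `statistics` module (the workshop itself uses R). The sample values are hypothetical and were picked so the results are easy to verify by hand.

```python
import statistics

# Hypothetical sample, chosen so the results are easy to check by hand.
data = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0, 11.0]

summary = {
    # Measures of central tendency
    "mean": statistics.mean(data),
    "median": statistics.median(data),
    "mode": statistics.mode(data),
    # Measures of dispersion
    "range": max(data) - min(data),
    "variance": statistics.variance(data),  # sample variance (divides by n - 1)
    "std_dev": statistics.stdev(data),      # sample standard deviation
}

for name, value in summary.items():
    print(f"{name:>8}: {value:.3f}")
```

Note that `statistics.variance` and `statistics.stdev` are the sample versions; `pvariance` and `pstdev` give the population versions when the data is the whole population.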
4. Inferential Procedures
Date: Saturday September 28, 2024
Time: 8 to 10 PM
Moving beyond description, inferential statistics allows you to make predictions and generalizations about a population based on a sample. This session will cover hypothesis testing, confidence intervals, and significance tests. You'll learn how to draw meaningful conclusions from your data and understand the importance of statistical significance in your analyses.
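A confidence interval and a significance test can be sketched as follows. This is an illustrative Python example (the workshop uses R) built on simulated, hypothetical data; it uses the large-sample normal approximation rather than a t-distribution, and the hypothesized mean `mu0` is an assumed value chosen for the example.

```python
import math
import random
import statistics

# Hypothetical sample: 50 simulated measurements (seeded for repeatability).
rng = random.Random(0)
sample = [rng.gauss(100.0, 15.0) for _ in range(50)]

n = len(sample)
mean = statistics.mean(sample)
std_err = statistics.stdev(sample) / math.sqrt(n)

# 95% confidence interval for the mean, using the normal approximation.
z_crit = statistics.NormalDist().inv_cdf(0.975)
ci_low, ci_high = mean - z_crit * std_err, mean + z_crit * std_err

# Two-sided z-test of H0: the population mean equals mu0 (an assumed value).
mu0 = 100.0
z_stat = (mean - mu0) / std_err
p_value = 2 * (1 - statistics.NormalDist().cdf(abs(z_stat)))

print(f"mean = {mean:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
print(f"z = {z_stat:.2f}, p = {p_value:.3f}")
```

The two results agree by construction: the test rejects at the 5% level exactly when `mu0` falls outside the 95% interval.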
5. Predictive Techniques 1
Date: Saturday October 26, 2024
Time: 8 to 10 PM
Predictive modeling is at the heart of many data science applications. Starting with the difference between supervised and unsupervised learning, we will explore techniques for building predictive models, including regression analysis and logistic regression, covering their assumptions, their data requirements, and the use of the predict command. We'll also discuss how to evaluate and validate the models to ensure they perform well on new, unseen data. Some cases will be solved using these two models.
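The fit, predict, and validate cycle described here can be sketched for the simplest case, a one-predictor linear regression. The example is illustrative Python (the workshop uses R and its predict command); the data, the held-out indices, and the helper names `fit_line` and `predict` are hypothetical.

```python
import math

# Hypothetical data, roughly y = 2x + 1 with small noise.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [3.1, 4.9, 7.2, 9.0, 10.8, 13.1, 15.0, 16.9]

# Hold out two points for validation; fit on the rest.
holdout_idx = {2, 5}
pairs = list(zip(xs, ys))
train = [p for i, p in enumerate(pairs) if i not in holdout_idx]
holdout = [p for i, p in enumerate(pairs) if i in holdout_idx]


def fit_line(points):
    """Ordinary least squares for one predictor: returns (intercept, slope)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def predict(model, x):
    """Apply a fitted (intercept, slope) model to a new x value."""
    intercept, slope = model
    return intercept + slope * x


model = fit_line(train)

# Root mean squared error on the points the model never saw.
rmse = math.sqrt(
    sum((y - predict(model, x)) ** 2 for x, y in holdout) / len(holdout)
)
print(f"intercept = {model[0]:.3f}, slope = {model[1]:.3f}, RMSE = {rmse:.3f}")
```

Measuring error only on the held-out points is what gives an estimate of performance on new, unseen data; error on the training points alone would be optimistic.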
6. Predictive Techniques 2
Date: Saturday November 30, 2024
Time: 8 to 10 PM
Classification trees, regression trees, and clustering will be introduced, with examples drawn from real-life business problems.
7. Time Series 1
Date: Saturday December 28, 2024
Time: 8 to 10 PM
Introduction to Time Series: Learn about time series data, its characteristics, and common applications.
Discover how to import time series data into R, handle missing values, and preprocess the data for analysis. Get hands-on experience with R packages like ggplot2 and other tools for visualizing time series data. Understand the components of time series data (trend, seasonality, and residuals) and how to decompose them using R.
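To make the trend, seasonality, and residual components concrete, here is a minimal sketch of a classical additive decomposition. It is written in Python for illustration (the workshop uses R packages for this); the quarterly series is synthetic and hypothetical, built from a known trend and seasonal pattern so the recovered components can be checked, and the function names are assumptions.

```python
PERIOD = 4  # e.g. quarterly data

# Hypothetical series: linear trend plus a seasonal pattern that sums to zero.
seasonal_true = [2.0, -1.0, -3.0, 2.0]
series = [10.0 + 0.5 * t + seasonal_true[t % PERIOD] for t in range(24)]


def centered_moving_average(values, period):
    """2 x period centered moving average for an even period; None at the edges."""
    half = period // 2
    trend = [None] * len(values)
    for t in range(half, len(values) - half):
        window = values[t - half : t + half + 1]
        # The two end points of the window get half weight.
        total = sum(window) - 0.5 * (window[0] + window[-1])
        trend[t] = total / period
    return trend


def seasonal_indices(values, trend, period):
    """Average the detrended values by position in the cycle, centered on zero."""
    buckets = [[] for _ in range(period)]
    for t, (v, tr) in enumerate(zip(values, trend)):
        if tr is not None:
            buckets[t % period].append(v - tr)
    means = [sum(b) / len(b) for b in buckets]
    offset = sum(means) / period
    return [m - offset for m in means]


trend = centered_moving_average(series, PERIOD)
seasonal = seasonal_indices(series, trend, PERIOD)
residual = [
    None if tr is None else v - tr - seasonal[t % PERIOD]
    for t, (v, tr) in enumerate(zip(series, trend))
]
print([round(s, 3) for s in seasonal])
```

Because the synthetic series contains no noise, the residuals here come out as zero away from the edges; on real data they hold whatever the trend and seasonal components do not explain.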
8. Time Series 2
Date: Saturday January 25, 2025
Time: 8 to 10 PM
Dive deeper into advanced techniques and forecasting methods. In this second time series session, we'll learn about stationarity, its importance in time series analysis, and how to achieve it through differencing. Explore how to use autocorrelation and partial autocorrelation functions to understand the dependencies in your data. Get introduced to popular time series models such as ARIMA, SARIMA, and Exponential Smoothing. Use R packages to build and validate forecasting models. Learn how to evaluate the accuracy of your forecasts using various metrics and techniques.
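The link between differencing and stationarity can be seen on a simulated random walk. The sketch below is illustrative Python (the workshop uses R); the series is simulated, the helper names `difference` and `acf` are hypothetical, and partial autocorrelation and model fitting (ARIMA, SARIMA, Exponential Smoothing) are not covered by it.

```python
import random

# Hypothetical non-stationary series: a random walk built from seeded noise.
rng = random.Random(1)
noise = [rng.gauss(0.0, 1.0) for _ in range(200)]
walk = []
level = 0.0
for step in noise:
    level += step
    walk.append(level)


def difference(values, lag=1):
    """Return the lag-differenced series x[t] - x[t - lag]."""
    return [values[t] - values[t - lag] for t in range(lag, len(values))]


def acf(values, lag):
    """Sample autocorrelation at the given lag."""
    n = len(values)
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    numer = sum(
        (values[t] - mean) * (values[t - lag] - mean) for t in range(lag, n)
    )
    return numer / denom


diffed = difference(walk)
print(f"lag-1 ACF of the random walk:       {acf(walk, 1):.3f}")
print(f"lag-1 ACF of the differenced series: {acf(diffed, 1):.3f}")
```

The random walk's autocorrelation stays high because each value carries the whole history of earlier steps, while first differencing recovers the underlying noise, whose autocorrelation is close to zero.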
Data Collection Intro: (0:00)
Cleaning: (18:48)
Imputation: (54:43)
Conversion into factor: (58:54)
Remove duplicates: (1:01:29)
Making Tables: (1:06:24)
Making Subsets: (1:08:40)
Outliers: (1:11:00)
Train Test Split: (1:29:01)
#datascience
#ml
#statistics
#cleaning
#datacleaning
#datacleansing
#datacleanup

Published: 17 Jul 2024
