
What is Data Transformation? | What is ETL? | What is Data Warehousing? 

Tech with Azam

My Complete Talend Course on Udemy:
www.udemy.com/course/talend-d...
--------------------- FREE COUPONS OF MY UDEMY COURSE -------------------------------------
I give away 3 discounted coupons for my Udemy course every month.
If you would like one, drop a comment in the Comments section, and my algorithm will randomly pick 3 students with whom I will share the discounted coupons via email.
The comment should include: "UDEMY COUPON: I AM IN"
Once I pick your names, I will ask you to share your email address with me.
The results will be announced soon!
--------------------------------------------------------------------------------------------------------------------------------
Data transformation is the process of converting data from one format or structure into another. It is critical to activities such as data integration and data management. Data transformation is a core part of the ETL process, and ETL in turn is an essential part of the data warehousing process. This video discusses some of the most important data transformation techniques, listed below:
1. Data Joining
Joining data is one of the most important functions of data transformation. A “join” is an operation in the SQL database language that allows you to connect two or more database tables by their matching columns.
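A minimal sketch of a join, using Python's built-in sqlite3 module with an in-memory database. The tables customers and orders, their columns, and the sample rows are assumptions made up for illustration; they do not come from the video.

```python
import sqlite3

# Hypothetical tables: customers and orders share a customer_id column.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Linus');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.5);
""")


def orders_with_names(connection):
    """Connect each order to its customer through the matching column."""
    cursor = connection.execute("""
        SELECT o.order_id, c.name, o.amount
        FROM orders AS o
        JOIN customers AS c ON c.customer_id = o.customer_id
        ORDER BY o.order_id
    """)
    return cursor.fetchall()


joined_rows = orders_with_names(conn)
```

Each returned row combines columns from both tables, matched on customer_id.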
2. Data Deduplication
Data deduplication is a data compression process where you identify and remove duplicate or repeated copies of information.
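One possible way to sketch deduplication in plain Python is shown below. The function name deduplicate, the choice of key fields, and the rule that values are compared case-insensitively after trimming whitespace are all assumptions for illustration; real pipelines often need fuzzier matching.

```python
def deduplicate(rows, key_fields):
    """Keep the first occurrence of each record, judged by key_fields.

    Assumption: two records are duplicates when their key fields match
    after trimming whitespace and ignoring letter case.
    """
    seen = set()
    unique = []
    for row in rows:
        key = tuple(str(row[field]).strip().lower() for field in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Hypothetical sample data with one repeated customer.
customers = [
    {"name": "Ada Lovelace", "email": "ada@example.com"},
    {"name": "ada lovelace ", "email": "ADA@example.com"},
    {"name": "Linus Torvalds", "email": "linus@example.com"},
]
unique_customers = deduplicate(customers, ["name", "email"])
```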
3. Keys Restructuring
When the tables in a data warehouse have keys with built-in meanings, serious problems can develop. For example, if a client phone number serves as a primary key, changing the phone number in the original data source means that the number has to change everywhere it appears in the data system. That is why it is recommended to use simple integer keys as primary keys, with no business information embedded in them.
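The idea of a meaning-free integer key can be sketched as follows. The class SurrogateKeyMap and its method names are hypothetical; the point is only that the integer key stays the same when the business value (here a phone number) changes.

```python
from itertools import count


class SurrogateKeyMap:
    """Assign simple integer keys to business ("natural") keys.

    Hypothetical helper: the integer carries no business meaning.
    """

    def __init__(self):
        self._next_id = count(1)
        self._ids = {}

    def key_for(self, natural_key):
        # Hand out a new integer the first time a natural key is seen.
        if natural_key not in self._ids:
            self._ids[natural_key] = next(self._next_id)
        return self._ids[natural_key]

    def change_natural_key(self, old, new):
        # The phone number changes; the surrogate key does not.
        self._ids[new] = self._ids.pop(old)


keys = SurrogateKeyMap()
client_key = keys.key_for("555-0100")
keys.change_natural_key("555-0100", "555-0123")
same_client_key = keys.key_for("555-0123")
```

Because other tables would reference the integer rather than the phone number, only one mapping entry has to change.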
4. Data Cleansing
Data Cleansing involves deleting out-of-date, inaccurate, or incomplete information to increase the accuracy of data. Also referred to as data scrubbing and data cleaning, data cleansing relies on the careful analysis of datasets and data storage protocols to support the most accurate data possible.
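A rough sketch of one cleansing pass, assuming each record is a dictionary with an "updated" date and that "out-of-date" means older than a chosen cutoff. The field names and the cutoff are illustrative assumptions, and detecting inaccurate values would need domain-specific rules not shown here.

```python
from datetime import date


def cleanse(records, required_fields, cutoff):
    """Drop records that are incomplete or older than the cutoff date."""
    clean = []
    for record in records:
        incomplete = any(record.get(f) in (None, "") for f in required_fields)
        updated = record.get("updated")
        stale = updated is None or updated < cutoff
        if not incomplete and not stale:
            clean.append(record)
    return clean


# Hypothetical sample records.
contacts = [
    {"name": "Ada", "email": "ada@example.com", "updated": date(2024, 3, 1)},
    {"name": "Bob", "email": "", "updated": date(2024, 4, 1)},           # incomplete
    {"name": "Cy", "email": "cy@example.com", "updated": date(2015, 1, 1)},  # stale
]
clean_contacts = cleanse(contacts, ["name", "email"], cutoff=date(2020, 1, 1))
```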
5. Data Validation
Data validation is the process of creating automated rules or algorithms that engage when the system encounters different data issues. Data validation helps ensure the accuracy and quality of the data you transform. For example, a rule could go into effect when the system finds that the first three fields in a row are empty (or NULL value).
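The rule from the example above (the first three fields of a row are empty or NULL) could be sketched like this. The function names and the decision to route failing rows to a separate reject list are assumptions; a real system might log, repair, or halt instead.

```python
def first_three_empty(row):
    """Rule: True when the first three fields are all None or empty strings."""
    return all(value is None or value == "" for value in row[:3])


def validate(rows):
    """Split rows into those that pass the rule and those that trigger it."""
    valid, rejected = [], []
    for row in rows:
        if first_three_empty(row):
            rejected.append(row)
        else:
            valid.append(row)
    return valid, rejected


# Hypothetical rows; None stands in for a NULL value.
rows = [
    ("1001", "Ada", "London", 250),
    (None, "", None, 99),
    ("1002", None, None, 10),
]
valid_rows, rejected_rows = validate(rows)
```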
6. Data Format Revision
Format revisions fix problems that stem from fields having different data types. Some fields might be numeric, and others might be text. One data system could treat text versus numeric information differently, so you might have to standardize the formats to integrate source data with the target data schema. This could involve the conversion of male/female, date/time, measurements, and other information into a consistent format.
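As an illustration of format revision, the sketch below standardizes gender codes and dates. The target conventions (single-letter codes with "U" for unknown, ISO 8601 dates) and the list of accepted input formats are assumptions chosen for the example.

```python
from datetime import datetime

# Assumed target convention: "M", "F", or "U" for unknown.
GENDER_CODES = {"m": "M", "male": "M", "f": "F", "female": "F"}

# Assumed input formats, tried in order.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def normalize_gender(value):
    """Map free-form gender text to a single-letter code."""
    return GENDER_CODES.get(value.strip().lower(), "U")


def normalize_date(value):
    """Convert a date string in any accepted format to ISO 8601."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {value!r}")
```

For instance, normalize_date("01/06/2024") and normalize_date("2024-06-01") both produce the same ISO string, so the target schema sees one consistent format.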
7. Data Derivation
Data derivation involves the creation of special rules to “derive” the specific information you want from the data source. For example, you might have a database that includes total revenue data from sales, but you’re only interested in loading the profit figures after subtracting costs and tax liabilities.
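The profit example can be written as a small derivation rule. The field names revenue, costs, and tax are assumptions about how the source row is laid out; Decimal is used so that currency arithmetic is exact.

```python
from decimal import Decimal


def derive_profit(row):
    """Derive profit as revenue minus costs minus tax liabilities."""
    return Decimal(row["revenue"]) - Decimal(row["costs"]) - Decimal(row["tax"])


# Hypothetical source row; amounts are kept as strings to avoid float rounding.
sale = {"revenue": "1000.00", "costs": "600.00", "tax": "80.00"}
profit = derive_profit(sale)
```

Only the derived profit value would then be loaded into the target, not the raw revenue figure.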
8. Data Integration
Data integration is the process of taking different data sources (such as separate databases and datasets relating to sales, marketing, and operations) and merging them into the same structure or schema. As a primary goal of ETL for data warehousing purposes, data integration supports the analysis of massive data sets by merging multiple data sources into an easy-to-analyze whole.
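A simplified sketch of integration: two sources that describe the same kind of fact with different field names are mapped onto one target schema. The source field names, the target schema, and the added "source" column are all assumptions for illustration.

```python
# Hypothetical mappings from each source's field names to the target schema.
SALES_FIELDS = {"cust_no": "customer_id", "total": "amount"}
MARKETING_FIELDS = {"customerId": "customer_id", "spend": "amount"}


def to_target_schema(record, field_map, source_name):
    """Rename a record's fields to the target schema and tag its origin."""
    unified = {target: record[source] for source, target in field_map.items()}
    unified["source"] = source_name
    return unified


def integrate(sales_rows, marketing_rows):
    """Merge both sources into one list that shares a single schema."""
    merged = [to_target_schema(r, SALES_FIELDS, "sales") for r in sales_rows]
    merged += [to_target_schema(r, MARKETING_FIELDS, "marketing") for r in marketing_rows]
    return merged


integrated = integrate(
    [{"cust_no": 1, "total": 250}],
    [{"customerId": 1, "spend": 40}],
)
```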
9. Data Filtering
Data filtering includes techniques used to refine datasets. The goal of data filtering is to distill a data source to only what the user needs by eliminating repeated, irrelevant, or overly sensitive data. In its most practical form, data filtering simply involves the selection of specific rows, columns, or fields to display from the dataset.
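In its simplest form, filtering by rows and columns might look like the sketch below. The function name filter_rows, the sample fields, and the choice to drop a sensitive "ssn" column are assumptions for the example.

```python
def filter_rows(rows, predicate, columns):
    """Keep only rows that satisfy predicate, and only the named columns."""
    return [{c: row[c] for c in columns} for row in rows if predicate(row)]


# Hypothetical dataset containing an overly sensitive column ("ssn").
people = [
    {"name": "Ada", "country": "UK", "ssn": "000-00-0001"},
    {"name": "Linus", "country": "FI", "ssn": "000-00-0002"},
]
uk_people = filter_rows(
    people,
    predicate=lambda row: row["country"] == "UK",
    columns=["name", "country"],
)
```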
10. Data Splitting
Data splitting refers to dividing a single column into multiple columns. For example, an address could be parsed so that the street name, house number, post code, city, etc. it contains are extracted and stored in new columns. Storing each part in its own column makes the values atomic and easier to query.
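The address example can be sketched with a regular expression. This assumes addresses follow the layout "house number street, city, post code"; real addresses vary widely, so the pattern and the column names here are illustrative only.

```python
import re

# Assumed layout: "<house number> <street>, <city>, <post code>".
ADDRESS_PATTERN = re.compile(
    r"^(?P<house_number>\d+\w*)\s+(?P<street>[^,]+),\s*"
    r"(?P<city>[^,]+),\s*(?P<post_code>.+)$"
)


def split_address(address):
    """Split one address string into separate columns, or None if it does not fit."""
    match = ADDRESS_PATTERN.match(address.strip())
    if match is None:
        return None
    return {name: value.strip() for name, value in match.groupdict().items()}


parts = split_address("221B Baker Street, London, NW1 6XE")
```

Addresses that do not fit the assumed layout return None, so they can be routed to manual review instead of being split incorrectly.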

Published: 1 Jun 2024

Comments: 7
@techwithazam6992 3 years ago
My Talend Course on Udemy: www.udemy.com/course/talend-data-integration-using-talend-open-studio/?referralCode=00F56563FA9680DE364D ---------------------- FREE COUPONS OF MY UDEMY COURSE -------------------------------------- I am giving away 3 Discounted Coupons for my Udemy Course every Month. If you are interested to get them, you need to drop a comment in the Comments Section and my Algorithm will randomly pickup 3 students with whom I will share the discounted Coupons via email. Comment Should Include: "UDEMY COUPON: I AM IN" Once I pickup your names, I will ask you to share your email address with me. The results will be announced soon! --------------------------------------------------------------------------------------------------------------------------------
@ovaispathan5065 2 years ago
Awesomeeeee session brother.... and u must be a super soft spoken person :):) love from USA ....
@tabrezmohammed 3 years ago
Thanks bro, very helpful..
@suhas3785 3 years ago
Really great
@fdecjrgm 3 years ago
UDEMY COUPON: I AM IN
@helloantonova 2 years ago
Thank you!
@nusratsdailys7507 3 years ago
Super