code.talks 2023 - Data Validation at Scale: Managing Data Quality in Complex Data Pipelines

code.talks (formerly Developer Conference)

by Wolfram Wingerath
Since data processing is at the core of many businesses today, ensuring good data quality is often required for smooth operations and valid business decisions. But what does "good" data quality actually mean? When is it "good enough"? And how do you make sure it stays "good enough" in the face of growing data volumes and evolving business processes?
In this presentation, we will introduce you to challenges and best practices for data validation in data-intensive domains, highlighting its critical impact on everything from product optimization and customer reporting to machine learning use cases. We will start with the dimensions along which data quality can be quantified, before exploring concrete strategies for ensuring quality along each of them. We will discuss how requirements can be specified using data constraints and illustrate this with a practical example. Finally, we will highlight the inherent challenges of handling data validation in Big Data applications and share our experience from having done this for more than 10 years.
By the end of the talk, you will not only understand the significance of data validation in a data-centric world, but also have a grasp of why this is a complex task and how it can be accomplished at scale.
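
To make the idea of specifying requirements as data constraints more concrete, here is a minimal sketch in Python. It is not taken from the talk: the Constraint class, the helpers not_null and in_range, the function pass_rates, and the field names user_id and load_time_ms are all hypothetical and chosen only for illustration. The sketch declares constraints as named predicates and reports, per constraint, the fraction of records that satisfy it, which is one simple way a quality dimension such as completeness or validity might be quantified.

```python
from dataclasses import dataclass
from typing import Any, Callable

# A record is modeled as a plain dict; real pipelines would likely use a schema.
Record = dict[str, Any]


@dataclass(frozen=True)
class Constraint:
    """A named predicate over a single record (hypothetical helper type)."""
    name: str
    check: Callable[[Record], bool]


def not_null(field: str) -> Constraint:
    # Completeness: the field must be present and not None.
    return Constraint(f"{field} is not null", lambda r: r.get(field) is not None)


def in_range(field: str, low: float, high: float) -> Constraint:
    # Validity: the field must be numeric and lie within [low, high].
    def check(r: Record) -> bool:
        value = r.get(field)
        return isinstance(value, (int, float)) and low <= value <= high
    return Constraint(f"{field} in [{low}, {high}]", check)


def pass_rates(records: list[Record], constraints: list[Constraint]) -> dict[str, float]:
    """Return, per constraint, the fraction of records that satisfy it."""
    total = len(records)
    return {
        c.name: (sum(c.check(r) for r in records) / total if total else 1.0)
        for c in constraints
    }


# Made-up sample data: one record lacks a user_id, one has a negative load time.
records = [
    {"user_id": "a", "load_time_ms": 120},
    {"user_id": None, "load_time_ms": 95},
    {"user_id": "c", "load_time_ms": -5},
    {"user_id": "d", "load_time_ms": 300},
]
constraints = [not_null("user_id"), in_range("load_time_ms", 0, 60000)]
rates = pass_rates(records, constraints)

for name, rate in rates.items():
    print(f"{name}: {rate:.0%} of records pass")
```

A pass rate per constraint, rather than a single pass/fail flag, leaves room for deciding what "good enough" means: a threshold (for example, alerting when a rate drops below an agreed value) can then be set separately for each constraint.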

Published: 18 Nov 2023
