Building Real-Time Analytics Applications Using Apache Pinot 

Data Council
12K views

Get the slides: www.datacouncil.ai/talks/buil...
ABOUT THE TALK
LinkedIn is the most advantageous social networking tool available to job seekers and business professionals today, with 610+ million members creating millions of posts, videos, and articles that generate tens of millions of shares, comments, and likes per day. LinkedIn has leveraged this activity data to build rich, interactive, user-facing analytics applications such as “Who Viewed My Profile”, Talent Insights, Ad Analytics, and Publisher Analytics, among others. These applications are all powered by Pinot, as are internal dashboards and anomaly detection and root cause analysis platforms like ThirdEye. This talk will present how Pinot has become the de facto solution for serving analytic queries in milliseconds, ad hoc reporting, monitoring, and anomaly detection on multidimensional data.
ABOUT THE SPEAKER
Kishore Gopalakrishna is a founding engineer at a stealth mode startup. Prior to that, he was the architect at LinkedIn’s analytics infra team. Kishore is passionate about solving hard problems in distributed systems. He has authored various distributed systems such as Apache Helix, a cluster management framework for building distributed systems; Espresso, a distributed document store; Apache Pinot, a real-time distributed OLAP engine; and ThirdEye, a platform for anomaly detection and root cause analysis at LinkedIn.
ABOUT DATA COUNCIL:
Data Council (www.datacouncil.ai/) is a community and conference series that provides data professionals with the learning and networking opportunities they need to grow their careers. Make sure to subscribe to our channel for more videos, including DC_THURS, our series of live online interviews with leading data professionals from top open source projects and startups.
FOLLOW DATA COUNCIL:
Twitter: / datacouncilai
LinkedIn: / datacouncil-ai
Facebook: / datacouncilai
Eventbrite: www.eventbrite.com/o/data-cou...

Science

Published: 26 Jul 2024

Comments: 11
@migueljimenezZ 3 years ago
Thank you for this excellent talk!
@nipuntalukdar 3 years ago
Great talk.
@ashypeshy 2 years ago
SUPERB
@maclovesgeet 3 years ago
We are using Cassandra as a metric time series data store and are looking for dashboarding on top of it. We looked at Superset, but Superset likes to speak SQL. That research led me to Apache Pinot. How would you compare Cassandra vs. Pinot for time series data? Numbers: 1,000 metrics, 500K metrics/minute, 200K dimensions.
@kishoreg1980 2 years ago
If you have only metrics and values, Cassandra is good enough, but if you have multiple dimensions for each metric, then something like Pinot is a better option.
@hemanthaugust7217 1 year ago
@kishoreg1980 Do you see any downsides to using a time series DB such as Prometheus for this use case (ignore its alerting and other capabilities if you don't need them)? I agree it's not a distributed system. If you have a lot of data, you could explore Grafana Mimir and the Grafana UI for dashboarding. Let me know if you see any problems with this solution. It's just 1K metrics, and 500K datapoints/min is not a lot. Yes, there are many dimensions to it, and Mimir can shard these and solve it at scale.
@hemanthaugust7217 1 year ago
@The Leaf Please explore Grafana Mimir too.
@mudunurisrujitha2084 4 years ago
Does Pinot have any GraphQL integration point as such?
@kishoreg1980 4 years ago
No.
@mudunurisrujitha2084 4 years ago
And what is the idea behind choosing Samza for stream processing?
@kishoreg1980 4 years ago
Samza was built at LinkedIn. One can use any system for stream processing: Flink, Spark Streaming, etc.