
ksqlDB HOWTO: Filtering 

Robin Moffatt

With ksqlDB you can filter streams of data in Apache Kafka and write the matching subset of one topic to a new Kafka topic.
ksqlDB uses SQL to describe the stream processing that you want to do. With the WHERE clause you can define predicates to filter the data as you require.
For example:
CREATE STREAM ORDERS_NY AS
SELECT *
FROM ORDERS
WHERE ADDRESS_STATE='New York';
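The same WHERE clause accepts compound predicates. The statements below are a minimal sketch that reuses only the ORDERS stream and ADDRESS_STATE column from the example above; the second state value and the LIKE pattern are illustrative assumptions, not taken from the video.

```sql
-- Transient push queries for trying out a predicate before persisting it.
-- String comparisons are case-sensitive, so 'new york' would not match.

-- Match either of two states ('New Jersey' is an illustrative value).
SELECT *
FROM ORDERS
WHERE ADDRESS_STATE = 'New York'
   OR ADDRESS_STATE = 'New Jersey'
EMIT CHANGES;

-- Match any state whose name starts with 'New'.
SELECT *
FROM ORDERS
WHERE ADDRESS_STATE LIKE 'New%'
EMIT CHANGES;
```

Once a predicate returns the expected rows, it can be wrapped in CREATE STREAM ... AS SELECT so that the result is written continuously to a new topic.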
-----
💾 Run ksqlDB yourself: ksqldb.io?.devx_ch.rmoff_youtube_TfX70zBHyPM&
☁️ Use ksqlDB as a managed service: www.confluent.io/confluent-cl...
👾 Demo code: github.com/confluentinc/demo-...
🤔 Questions? Join the Confluent Community at confluent.io/community/ask-th...
-----
⏱ Time codes
00:00:00 Intro
00:00:27 Listing topics on the Kafka cluster in ksqlDB
00:00:53 Consuming from a Kafka topic with ksqlDB using PRINT
00:01:44 Creating a stream in ksqlDB on an existing topic using Avro schema
00:02:20 Show ksqlDB stream schema with DESCRIBE
00:02:33 Nested objects in ksqlDB
00:02:47 Querying a stream in ksqlDB
00:03:16 Accessing nested fields in ksqlDB
00:03:29 Filtering in ksqlDB
00:04:09 Using predicates in ksqlDB
00:04:35 Persisting a continuous query's output into a new stream
00:04:53 CREATE STREAM … AS SELECT
00:05:20 Changing auto.offset.reset in ksqlDB
00:06:13 Creating a filtered stream in ksqlDB
00:06:20 Changing the properties of the target topic (name, partition count, etc)
00:07:06 Viewing the filtered topic properties
00:07:27 Viewing the schema of the new stream
00:07:41 Viewing the original and filtered topics together
00:08:39 Recap & Summary
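
The steps listed in the time codes can be sketched roughly as the ksqlDB session below. This is an approximation under stated assumptions rather than a transcript of the video: the topic names 'orders' and 'orders_ny', the partition count, and the nested CUSTOMER->NAME field are hypothetical placeholders, and only ORDERS, ORDERS_NY and ADDRESS_STATE come from the description above.

```sql
-- List the topics on the Kafka cluster.
SHOW TOPICS;

-- Consume a few messages straight from a topic ('orders' is a hypothetical name).
PRINT 'orders' FROM BEGINNING LIMIT 5;

-- Declare a stream over the existing topic; with Avro, the columns are
-- read from the schema held in Schema Registry.
CREATE STREAM ORDERS WITH (KAFKA_TOPIC='orders', VALUE_FORMAT='AVRO');

-- Show the stream's schema, including any nested (STRUCT) columns.
DESCRIBE ORDERS;

-- Access a nested field with the -> operator
-- (CUSTOMER->NAME is a hypothetical field).
SELECT CUSTOMER->NAME, ADDRESS_STATE
FROM ORDERS
EMIT CHANGES
LIMIT 5;

-- Process the topic from its start rather than only new messages.
SET 'auto.offset.reset' = 'earliest';

-- Persist the filtered output to a new stream, overriding the
-- target topic's name and partition count (both values are assumptions).
CREATE STREAM ORDERS_NY WITH (KAFKA_TOPIC='orders_ny', PARTITIONS=4) AS
SELECT *
FROM ORDERS
WHERE ADDRESS_STATE='New York';

-- Inspect the schema of the new stream.
DESCRIBE ORDERS_NY;
```

Setting auto.offset.reset before the CREATE STREAM ... AS SELECT matters because the persistent query picks up the value in effect when it is created.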

Category: Science

Published: 3 Aug 2024

Comments: 5
@kevinjang8209, 3 years ago
Great tutorial! I liked how you were able to query for the nested object!
@rmoff, 3 years ago
Glad it was helpful!
@kurtvicious9375, 3 years ago
Hi Robin, I am working on a project that filters streaming data with ksqlDB based on account number, and it works. But how do I measure ksqlDB's performance when filtering data? My current test plan varies the variables that affect the streaming process: the number of account numbers, the number of threads producing raw data (multiple producers), the amount of data per thread, and the amount of data filtered into the clean topic. I judge ksqlDB's performance by the gap between the first and last record timestamps shown in C3/Kafka Tool. Any thoughts are welcome. Thanks
@rmoff, 3 years ago
Hi Kurt, this is a good question for asking over on forum.confluent.io/ - there's a dedicated ksqlDB category where you should be able to get help.
@Daniel81O, 2 years ago
Here's a five-star question to which I can't find a proper answer. All the examples I found are kind of poor and based on a single element that you filter against. How do you filter this stream of data assuming that you have a list of states (think of the USA, which has about 50), but for the sake of the example let's assume you have about 10K states? When a match is found (that is, a state which is part of that 10K list), you send this record onto a new topic. That's the type of example I would like to see...