When you do time-based processing in ksqlDB (such as windowed aggregations or stream-stream joins), it is important to correctly define the timestamp by which your data should be processed. This could be the timestamp in the Kafka message metadata, or a field in the value of the Kafka message itself.
By default, ksqlDB uses the timestamp of the Kafka message. You can change this by specifying WITH (TIMESTAMP='…') in your CREATE STREAM statement to identify a value field to use as the timestamp instead.
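As a rough sketch of what that looks like, the statement below declares a stream whose row timestamp comes from a field in the message value. The stream name (sensor_readings), topic name, column names, and timestamp pattern are hypothetical and chosen only for illustration; adjust them to match your own data.

```sql
-- Hypothetical stream and topic; the columns are assumptions for illustration.
-- reading_ts is a string field in the message value, e.g. 2024-08-03T10:15:00Z
CREATE STREAM sensor_readings (
    sensor_id   VARCHAR,
    temperature DOUBLE,
    reading_ts  VARCHAR
  ) WITH (
    KAFKA_TOPIC      = 'sensor_readings',
    VALUE_FORMAT     = 'JSON',
    -- Use the value field, not the Kafka message timestamp, as the row timestamp
    TIMESTAMP        = 'reading_ts',
    -- Needed because reading_ts is a string; literal quotes are doubled
    TIMESTAMP_FORMAT = 'yyyy-MM-dd''T''HH:mm:ssX'
  );
```

If the field already holds epoch milliseconds as a BIGINT, TIMESTAMP_FORMAT should not be needed.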
Use the ROWTIME system field to view the timestamp of the ksqlDB row.
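One way to check which timestamp is in effect, and then to aggregate on it, is sketched below. It assumes a hypothetical stream called sensor_readings with sensor_id and reading_ts columns; these names are illustrative and do not come from the demo code.

```sql
-- Compare ROWTIME (epoch milliseconds) with the timestamp field in the value.
-- If the stream was declared with TIMESTAMP='reading_ts', the two should agree.
SELECT ROWTIME,
       TIMESTAMPTOSTRING(ROWTIME, 'yyyy-MM-dd HH:mm:ss') AS rowtime_formatted,
       reading_ts
  FROM sensor_readings
  EMIT CHANGES
  LIMIT 5;

-- Hourly count per sensor; rows are assigned to windows based on ROWTIME.
SELECT sensor_id,
       COUNT(*) AS readings
  FROM sensor_readings
  WINDOW TUMBLING (SIZE 1 HOUR)
  GROUP BY sensor_id
  EMIT CHANGES;
```

Because the window assignment follows ROWTIME, the same aggregate can give different results depending on whether the stream uses the Kafka message timestamp or a field from the value.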
-----
💾 Run ksqlDB yourself: ksqldb.io
☁️ Use ksqlDB as a managed service: www.confluent.io/confluent-cl...
👾 Demo code: github.com/confluentinc/demo-...
🤔 Questions? Join the Confluent Community at confluent.io/community/ask-th...
📎 Syntax for the TIMESTAMP and TIMESTAMP_FORMAT arguments in the WITH clause: docs.ksqldb.io/en/latest/deve...
-----
⏱ Time codes
00:00:00 Introduction
00:02:52 Examining a timestamp in the message value
00:03:22 Creating a time-based aggregate - without considering timestamps properly :)
00:04:09 ROWTIME in ksqlDB
00:05:01 Event time vs Processing time
00:05:56 How to change the timestamp that ksqlDB uses for ROWTIME
00:06:50 Validating that ROWTIME has been set to the timestamp field in the message value
00:07:35 Aggregating data based on the timestamp in a field in the message value (not the Kafka message timestamp)
00:08:24 Recap
Aug 3, 2024