This talk is from #KafkaSummit Americas 2021
📝 Abstract:
The great thing about streams of real-time events is that they can be used to spot behaviours as they happen and respond to them as needed. Instead of waiting until tomorrow to find out what happened yesterday, we can act on things straight away.
This talk will show a real-life example of one particularly useful pattern to detect: ships engaged in potentially suspicious behaviour at sea. Transshipping is often used for legitimate purposes to optimise efficiency, but it can also be used for nefarious purposes such as illegal fishing.
By capturing streams of maritime AIS data in real time into Kafka and processing them with ksqlDB, it's possible to detect the kind of characteristics that could indicate behaviour of interest, such as ships moving slowly in close proximity for a length of time.
I'll demonstrate how the data was ingested from a raw TCP feed, unified with reference data from CSV files, and then processed to spot patterns, with the resulting real-time stream of matches written to a new Kafka topic for validation and analysis.
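To make the "moving slowly in close proximity" idea concrete, here is a minimal Python sketch of the per-pair check. It is not the ksqlDB code from the talk: the `PositionReport` type, the `is_candidate_pair` helper, and the 1 km and 2 knot thresholds are all assumptions chosen for illustration, and the distance is a plain haversine calculation standing in for a geo-distance function.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


@dataclass
class PositionReport:
    """Hypothetical, simplified AIS position report."""
    mmsi: str           # vessel identifier
    lat: float          # degrees
    lon: float          # degrees
    speed_knots: float  # speed over ground


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_candidate_pair(a, b, max_km=1.0, max_knots=2.0):
    """True if two different vessels are both slow and close together.

    The thresholds are illustrative assumptions, not values from the talk.
    """
    if a.mmsi == b.mmsi:
        return False
    both_slow = a.speed_knots < max_knots and b.speed_knots < max_knots
    close = haversine_km(a.lat, a.lon, b.lat, b.lon) < max_km
    return both_slow and close


ship_a = PositionReport("111111111", 50.0, -1.0, 0.8)
ship_b = PositionReport("222222222", 50.0, -1.005, 1.1)
print(is_candidate_pair(ship_a, ship_b))
```

A single match like this only says two ships were slow and close at one moment; the "for a length of time" part needs the matches to be grouped over time.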
---
⏱ Timecodes:
00:00:00 Introduction
00:01:30 Ingesting AIS data into Apache Kafka
00:02:12 Creating a managed Apache Kafka cluster
00:03:18 Handling different message types in the same Apache Kafka topic
00:05:14 Routing Kafka messages to different topics using ksqlDB based on message type
00:07:04 Modelling a Kafka topic into a ksqlDB table (state store)
00:09:14 Enriching ship movement reports with ship attribute data using a Stream-Table join in ksqlDB
00:11:09 Visualising real-time AIS data from Kafka with Elasticsearch and Kibana
00:13:03 Explanation of transshipping
00:13:37 Defining transshipping in stream processing terms
00:14:32 Ingesting additional data from CSV file into Kafka
00:15:50 Adding the new reference data to the existing ship info table
00:16:46 Building ksqlDB stream processing queries to do pattern matching
00:17:23 Stream-Stream join in ksqlDB
00:18:50 ksqlDB Stream-Stream join type and window explanation
00:19:28 Determining the distance between two vessels using the built-in GEO_DISTANCE function
00:20:10 Using the data lineage view to visualise the stream processing components and data flow
00:20:39 Visualising ships in close proximity in Kibana
00:20:45 Using a session window aggregation in ksqlDB
00:24:11 Streaming data from Kafka to Elasticsearch using the managed Elasticsearch connector
00:24:59 Exploring the finished pattern match results in Kibana map view
00:26:58 Recap and Summary
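The session window step in the timecodes can be pictured with a small Python sketch. It illustrates the general idea only: events separated by no more than an inactivity gap fall into the same session, and only sessions that last long enough are kept. The function names, the 600 second gap, and the 1800 second minimum duration are hypothetical, and treating a gap of exactly `gap_seconds` as part of the same session is an assumption of this sketch.

```python
def sessionize(timestamps, gap_seconds):
    """Group event timestamps (in seconds) into sessions.

    A new session starts whenever the gap since the previous event
    exceeds gap_seconds.
    """
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= gap_seconds:
            sessions[-1].append(ts)   # extend the current session
        else:
            sessions.append([ts])     # start a new session
    return sessions


def long_sessions(timestamps, gap_seconds=600, min_duration_seconds=1800):
    """Return (start, end) for sessions lasting at least min_duration_seconds."""
    return [
        (s[0], s[-1])
        for s in sessionize(timestamps, gap_seconds)
        if s[-1] - s[0] >= min_duration_seconds
    ]


# Timestamps at which one pair of ships was seen slow and close together.
matches = [0, 300, 600, 5000, 5100]
print(sessionize(matches, gap_seconds=600))
print(long_sessions(matches, gap_seconds=600, min_duration_seconds=500))
```

Here the first three matches form one session spanning 600 seconds and the last two form another spanning 100 seconds, so only the first passes a 500 second minimum.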
---
📚Resources:
👉 Sign up for Confluent Cloud rmoff.dev/l2f -- use code RMOFF200 for money off your bill
👾Code to try it yourself: github.com/confluentinc/demo-...
🧑🏻‍🏫 Read the two blogs that inspired this talk:
📕 rmoff.dev/ais01
📕 rmoff.dev/ais02
3 Aug 2024