
Kafka Connect in Action: Loading a CSV file into Kafka 

Robin Moffatt
25K views

Ingesting data from a CSV file into a Kafka topic is straightforward with Kafka Connect. Kafka Connect is part of Apache Kafka and requires only a JSON file to configure - no coding!
This video shows you how to stream CSV files into Kafka, how to manage the schema, and as a bonus also shows streaming the same data out to a database such as Postgres, as well as processing it with ksqlDB.
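A minimal Spooldir CSV source configuration along the lines of what the video builds might look like this (a sketch based on the connector's documentation; the paths, topic name, and key field are illustrative, not copied from the video):

```json
{
  "connector.class": "com.github.jcustenborder.kafka.connect.spooldir.SpoolDirCsvSourceConnector",
  "topic": "orders_spooldir_00",
  "input.path": "/data/unprocessed",
  "finished.path": "/data/processed",
  "error.path": "/data/error",
  "input.file.pattern": ".*\\.csv",
  "csv.first.row.as.header": "true",
  "schema.generation.enabled": "true",
  "schema.generation.key.fields": "order_id"
}
```

You create the connector by POSTing this JSON to the Kafka Connect REST API (`/connectors`); `schema.generation.enabled` together with `csv.first.row.as.header` derives the schema from the CSV header row.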
To learn more about Kafka Connect see docs.confluent.io/current/con...
Code used in this video: github.com/confluentinc/demo-...
--
ℹ️ Table of contents
00:00:44 Brief introduction to Kafka Connect
00:01:39 Checking that the correct Kafka Connect plugin is installed
00:02:46 Kafka, Bytes, and Schemas
00:05:06 Creating a connector to ingest data from CSV file and set the schema based on header row
00:09:43 File metadata stored in the Kafka message header
00:10:18 Setting the message key
00:14:35 Manipulating the schema / changing data types
00:19:22 Ingesting raw CSV into a Kafka topic without a Schema
00:22:43 Streaming data from CSV into Kafka into a Database
00:28:27 Updating database rows in-place from CSV file ingest
00:28:42 Impromptu Kafka Connect troubleshooting ;-)
00:32:34 Filtering and aggregating CSV data in Kafka using ksqlDB
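The ksqlDB step at the end (filtering and aggregating the ingested data) looks roughly like this - a sketch only; the stream and column names here are assumptions, not taken from the video:

```sql
-- Register the ingested topic as a ksqlDB stream
-- (schema is picked up from the Schema Registry)
CREATE STREAM ORDERS WITH (KAFKA_TOPIC='orders_spooldir_00', VALUE_FORMAT='AVRO');

-- Filter rows as they arrive
SELECT * FROM ORDERS WHERE ORDER_TOTAL_USD > 100 EMIT CHANGES;

-- Aggregate into a continuously-updated table
CREATE TABLE ORDERS_BY_ITEM AS
  SELECT ITEM,
         COUNT(*) AS ORDER_COUNT,
         MAX(ORDER_TOTAL_USD) AS MAX_TOTAL
  FROM ORDERS
  GROUP BY ITEM;
```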
--
💾 Download Kafka Connect Spooldir plugin: www.confluent.io/hub/jcustenb...
🎈 Kafka Connect Spooldir docs: docs.confluent.io/current/con...
🎈 Single Message Transforms (SMT): docs.confluent.io/current/con...
🎈 Cast SMT: docs.confluent.io/current/con...
🎈 Kafka Connect JDBC Source connector: www.confluent.io/blog/kafka-c...
🎈 Kafka Connect JDBC Sink connector docs: docs.confluent.io/current/con...
🎥 Installing JDBC Driver for Kafka Connect: rmoff.dev/fix-jdbc-driver-video
🎥 Streaming data from Kafka to a database: rmoff.dev/kafka-jdbc-video
🎥 JDBC sink and schemas: rmoff.dev/jdbc-sink-schemas
🎥 Introduction to ksqlDB: rmoff.dev/ksqldb-introduction
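For the "CSV into Kafka into a database" bonus, a JDBC Sink configuration is along these lines (a sketch; the connection details and key field are placeholders):

```json
{
  "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
  "topics": "orders_spooldir_00",
  "connection.url": "jdbc:postgresql://postgres:5432/postgres",
  "connection.user": "postgres",
  "connection.password": "postgres",
  "insert.mode": "upsert",
  "pk.mode": "record_key",
  "pk.fields": "order_id",
  "auto.create": "true"
}
```

With `insert.mode=upsert` and the primary key taken from the message key, re-ingesting a row updates the existing database row in place, which is what the 00:28:27 section demonstrates.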
--
☁️ Confluent Cloud ☁️
Confluent Cloud is a managed Apache Kafka and Confluent Platform service. It scales to zero and lets you get started with Apache Kafka at the click of a mouse. You can sign up at confluent.cloud/signup?... and use code 60DEVADV for $60 towards your bill (small print: www.confluent.io/confluent-cl...)

Science

Published: 3 Aug 2024

Comments: 44
@SuperDstiles · 3 years ago
Just getting started with Kafka, but this video makes me realise how useful it’s going to be. Great video, thank you
@rmoff · 3 years ago
Thanks!
@paulocasaretto527 · 2 years ago
Very useful! Thanks Robin!
@raoulvaneijndhoven1473 · 3 years ago
This is great, thank you. I have a question regarding Timestamp conversion, but placed it on the community page.
@anak-anakindonesia · 2 years ago
Thanks, this is a good video. You saved me time, bro. 👍
@rmoff · 2 years ago
Glad I could help!
@laifimohammed1202 · 3 years ago
Very awesome. I'm a beginner and it's helpful.
@rmoff · 3 years ago
Thanks for the comment :) Glad I could help!
@MsMilka87 · 4 years ago
Thanks!
@drakezen · 2 years ago
Brilliant
@sdon1011 · 10 months ago
Very interesting series of videos. Very helpful. A little remark: at 38:58, it seems that the order value to be inserted was way higher than the currently displayed maximum (22190899.73 vs 216233.09), and still this value was not updated.
@gauravlotekar660 · 4 years ago
U d champion !!
@mouhammaddiakhate3546 · 3 years ago
Just Awesome !!
@rmoff · 3 years ago
Thanks, glad you liked it :)
@nesreenmohd665 · 5 months ago
Thanks
@gregorydonovan7181 · 3 years ago
Great video Robin. QQ - is there a way to publish an event once a file has been completely ingested? Does ksqlDB provide any hooks that might help? I need to ingest a customer file and then send subsets of that file to a number of vendors. I figure I'll have a consumer for each vendor.
@rmoff · 3 years ago
I've answered your question over here: forum.confluent.io/t/submit-message-when-csv-has-been-ingested/1658/2
@karthikb.s.k.4486 · a year ago
Thank you Robin for the great video. If we edit the same CSV file, all records get processed again, I think. Can we control this behaviour? I have seen this with the S3 source connector.
@hitrem · 2 years ago
Thank you so much for this guide! But I've got an issue, and I don't know if it's normal. Personally, when I cp a .csv file (same name) back into my unprocessed directory, the file is processed again and the offset goes from 500 to 501, 502, etc. Is this normal? Also, when a file is processed it creates an "orders.csv" subdirectory in the processed directory. Is this due to some update?
@rmoff · 4 years ago
For questions about the connector and Apache Kafka in general please head to confluent.io/community/ask-the-community/
@drhouse1980 · 3 years ago
Nice video. Do you know if this connector can get the filename?
@gregorydonovan7181 · 3 years ago
@@drhouse1980 he shows how to get the headers via Kafkacat but the question I have is how to then turn this into a topic that other consumers can subscribe to. For example, after a customer sends a request you then want to ship out requests to vendors then marry the data up later.
@larrosapablo · 4 years ago
Hi, is there any connector for a CSV sink? Thanks!
@Agrossio · 7 months ago
Great video!! Will this work for processing a CSV with 1,000,000 records? Would it take less than an hour to save it to an Oracle database?
@bisworanjanbarik7350 · 3 years ago
This is a really great video. I want to load a huge CSV file into a database every day. Can I use the Kafka CSV connector?
@rmoff · 3 years ago
If you just have a CSV file and a database, I don't think adding Kafka in just to do the load would make any sense - there are plenty of database tools to load the CSV file directly. If you already have the data in a Kafka topic and want to load it into a database then you can use the JDBC Sink connector. For more questions head over to forum.confluent.io.
@shubhamgawali8030 · 2 years ago
Hi Robin, great video! I have one requirement: can we send only the file name, not the file content, whenever a new file is created in the directory?
@rmoff · 2 years ago
I don't know if there is a connector that does this. You could ask at forum.confluent.io/.
@vishnumurali522 · 4 years ago
Hi @rmoff, may I know how to get the same CSV file data from an SFTP location which uses key-based authentication...
@vishnumurali522 · 4 years ago
I can't figure out how to specify the key values; I am sending the request to start the connector from Postman.
@wardsworld · 4 years ago
Great video and amazing content! Could you please share a repo/link with the code used in this video? :)
@rmoff · 4 years ago
Thanks, glad you liked it! The code is here: github.com/confluentinc/demo-scene/tree/master/csv-to-kafka
@kaisneffati8801 · 3 years ago
Does it support file changes? When the file changes, I want to re-read the file!
@rmoff · 3 years ago
The spooldir connector moves files to a new folder once ingested. I don't know if the functionality you describe is available in other connectors. Check out www.confluent.io/hub/streamthoughts/kafka-connect-file-pulse and www.confluent.io/hub/mmolimar/kafka-connect-fs perhaps. For any more questions, head to forum.confluent.io/ :)
@schoesa · a year ago
If I run docker-compose up -d, it hangs every time while downloading the Kafka Connect JDBC hub plugin.
@rmoff · a year ago
hi, the best place to get help is at www.confluent.io/en-gb/community/ask-the-community/ :)
@phemsobki1929 · 3 years ago
Hi, how can I move data from a SQL Server database running on Windows Server into Kafka?
@rmoff · 3 years ago
See rmoff.dev/no-more-silos. Connectors suitable include the JDBC Source connector or Debezium.
@phemsobki1929 · 3 years ago
@@rmoff thanks
@nokap2695 · 2 years ago
At 5:20 I get an error, any idea how to fix? HTTP/1.1 405 Method Not Allowed; X-Confluent-Control-Center-Version: 6.2.1; X-Confluent-Control-Session: 96af01d2-6b69-45ea-937c-1f42c8aa7f78; Strict-Transport-Security: max-age=31536000; Content-Length: 0
@rmoff · 2 years ago
Hi, please post this at forum.confluent.io/ :)
@stefen_taime · 2 years ago
How to solve: % ERROR: Failed to query metadata for topic orders_spooldir_00: Local: Broker transport failure?
@rmoff · 2 years ago
Hi, please post this at forum.confluent.io/ :)
@georgelza · a month ago
Realized this is an old-ish video... you don't show at any point how you started your kafkacat container; also, of course, kafkacat has since been renamed to kcat.