What Is S3 And How Can You Query It With AWS Athena - AWS Data Engineering 101 

Seattle Data Guy
96K subscribers
3.2K views

S3 is a commonly used AWS solution for data lakes and staging areas.
Data engineers need to know AWS, and S3 also supports many other solutions, such as Snowflake when hosted on AWS.
So what is S3, and how can data engineers use it?
And how can data engineers query the data in it with AWS Athena?
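As a rough illustration of the Athena side, here is a minimal sketch of defining an external table over CSV files in an S3 prefix and submitting the statement through boto3. The database, table, column names, bucket paths, and both helper functions are hypothetical placeholders, not taken from the video; the sketch also assumes comma-delimited files with a single header row.

```python
def build_csv_table_ddl(database, table, columns, s3_location):
    """Build Athena DDL for an external table over CSV files in S3.

    `columns` is a list of (name, athena_type) pairs. All names here are
    illustrative assumptions, not values from the video.
    """
    column_sql = ",\n  ".join(f"`{name}` {dtype}" for name, dtype in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS `{database}`.`{table}` (\n"
        f"  {column_sql}\n"
        ")\n"
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','\n"
        "STORED AS TEXTFILE\n"
        f"LOCATION '{s3_location}'\n"
        # Assumes each CSV file starts with one header row.
        "TBLPROPERTIES ('skip.header.line.count'='1')"
    )


def run_athena_query(sql, database, output_location):
    """Submit a statement to Athena and return its query execution ID.

    Requires AWS credentials and an S3 location for query results.
    """
    import boto3  # imported lazily so the DDL builder works without boto3

    client = boto3.client("athena")
    response = client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]


if __name__ == "__main__":
    ddl = build_csv_table_ddl(
        "demo_db",
        "orders",
        [("order_id", "string"), ("amount", "double")],
        "s3://example-bucket/orders/",
    )
    print(ddl)
```

The table only stores metadata; the CSV files stay in S3 and Athena reads them at query time. To actually create the table, the printed DDL could be passed to `run_athena_query` along with a results location such as a hypothetical `s3://example-bucket/athena-results/`.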
I also reference a video from @mastering_snowflake that shows how to set up an S3 Snowpipe integration. Here is the link:
• Using Snowpipe | How t...
If you enjoyed this video, check out some of my other top videos.
Best AWS Services You Need To Know As A Data Engineer - How To Become A Data Engineer
• Best AWS Services You ...
Data Modeling - Walking Through How To Data Model As A Data Engineer - Dimensional Modeling 101
• Data Modeling - Walkin...
If you would like to learn more about data engineering, then check out Google's GCP certificate
bit.ly/3NQVn7V
If you'd like to read up on my updates about the data field, then you can sign up for our newsletter here.
seattledataguy.substack.com/
Or check out my blog
www.theseattledataguy.com/
And if you want to support the channel, then you can become a paid member of my newsletter
seattledataguy.substack.com/s...
Tags: Data engineering projects, Data engineer project ideas, data project sources, data analytics project sources, data project portfolio
_____________________________________________________________
Subscribe: / @seattledataguy
_____________________________________________________________
About me:
I have spent my career focused on all forms of data. I have focused on developing algorithms to detect fraud, reduce patient readmission, and redesign insurance provider policy to help reduce the overall cost of healthcare. I have also helped develop analytics for marketing and IT operations in order to optimize limited resources such as employees and budget. I privately consult on data science and engineering problems, both solo and with a company called Acheron Analytics. I have experience both working hands-on with technical problems and helping leadership teams develop strategies to maximize their data.
*I do participate in affiliate programs. If a link has an "*" by it, then I may receive a small portion of the proceeds at no extra cost to you.

Published: 29 Jul 2024

Comments: 18
@PrinciplesOrDie 3 months ago
You could've used a Glue crawler to create the tables faster. You can just alter the DDL code in Athena later if you don't like the way it was put together.
@SeattleDataGuy 3 months ago
100%! I just wanted to go through the CSV S3 bucket option this time. But I am planning to go over AWS Glue and some of the various Glue concepts (the ETL, catalog, etc.) in a future video. This is meant to be a series, so I am trying to only add so much per video.
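For readers curious about the crawler route mentioned in this exchange, here is a minimal sketch of pointing a Glue crawler at an S3 prefix with boto3 so that it infers the table definitions. The crawler name, IAM role ARN, database, bucket path, and both helper functions are hypothetical placeholders; this is not the approach shown in the video, which defines the table by hand over the CSV bucket.

```python
def build_crawler_config(name, role_arn, database, s3_path):
    """Build keyword arguments for the Glue `create_crawler` call.

    All values passed in are illustrative assumptions.
    """
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        # The crawler scans this S3 prefix and infers a table schema from it.
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }


def create_and_start_crawler(config):
    """Create the crawler and start one run. Requires AWS credentials."""
    import boto3  # imported lazily so the config builder works without boto3

    glue = boto3.client("glue")
    glue.create_crawler(**config)
    glue.start_crawler(Name=config["Name"])


if __name__ == "__main__":
    config = build_crawler_config(
        "orders-crawler",
        "arn:aws:iam::123456789012:role/example-glue-role",
        "demo_db",
        "s3://example-bucket/orders/",
    )
    print(config)
```

Once a crawler run finishes, the inferred tables land in the Glue Data Catalog, where Athena can query them and where the DDL can still be adjusted afterwards, as the comment above suggests.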
@AndrewAlarcon17 3 months ago
This was super insightful. Would love more stuff like this!
@SeattleDataGuy 3 months ago
Glad you enjoyed it!
@ansonnn_ 3 months ago
Thanks for the amazing video, as always. We are using Athena as our main "engine" (not sure if that's the right term) to connect directly with Apache Superset for our dashboarding purposes. Our datasets are mostly in Hudi format, with very few in Parquet format. We are always querying our datasets from S3 using PySpark. I don't think using another huge data warehouse solution like Snowflake or BigQuery makes sense. Or are we missing out on something crucial here? Just some thoughts...
@hansmandler7284 3 months ago
Yeah, that's literally what I did last weekend :) Good to see that the professionals do it the same way I did.
@SeattleDataGuy 3 months ago
What were you doing? Reading from an S3 bucket?
@SeattleDataGuy 3 months ago
If you guys want to learn more about data engineering, then sign up for my newsletter here seattledataguy.substack.com/ or join the Discord here discord.gg/2yRJq7Eg3k
@richardduncan3403 3 months ago
I now know why it is called S3. Nice :)
@SeattleDataGuy 1 month ago
Glad you found the video helpful