
Top AWS Services A Data Engineer Should Know 

DataEng Uncomplicated
18K subscribers
151K views

This is a high-level overview of the top AWS services a data engineer should know in order to solve data engineering challenges. I explain them using an example: integrating two different data sources into a central data repository so that our hypothetical analytics team can perform its own self-service analytics. The video is broken down into Data Ingestion, Data Lake, Transformation, Data Warehouse, Data Analytics, Application Integration, Data Pipeline Orchestration, and Monitoring.
Timeline
00:00 Introduction
01:05 Data Ingestion
04:24 Storage - S3
05:10 Transformation
05:47 Data Catalog
06:44 Data Warehouse
07:13 Data Analytics
09:08 Application Integration
10:28 Orchestration
11:57 Monitoring
buy me a coffee: www.buymeacoffee.com/dataengu
useful links:
AWS Serverless Data Lake Architecture: • AWS Serverless Data La...
Optimize Data Lake: • 3 Tips To Optimize You...
SNS vs SQS: • SNS vs SQS Comparison?...
#AWS
#dataengineering

Published: 16 Jun 2024

Comments: 167
@cringe6006 1 year ago
Woah, woah. I know nothing about AWS, so why the heck did I totally understand this video? It was crystal clear. I usually don't subscribe to channels, but this time it was not even a question 👍🏾 Man, I wish you could teach me all about data engineering.
@DataEngUncomplicated 1 year ago
Thanks for the wonderful feedback! I'm glad the way I explained it was helpful. Thanks for subscribing! More AWS related content to come.
@cringe6006 1 year ago
@DataEngUncomplicated Thank you! Eagerly waiting for more content 😁
@DataEngUncomplicated 1 year ago
Thanks, I try to strike a balance between overview videos and technical tutorials on AWS. Fun fact: I think you are my 5,000th subscriber!
@cringe6006 1 year ago
@DataEngUncomplicated 👏🥳🎉🎊
@renzcarillo7277 2 years ago
As a self-taught data engineering student, I find figuring out which AWS services to start with very hard. This indeed uncomplicates everything!
@DataEngUncomplicated 2 years ago
Thank you for the kind words Renz! I'm glad it was helpful.
@francismagnusson378 1 year ago
Hey there, could you give a link to a resource for data engineering? I'm about to start a job in DE and I'm kind of intimidated by the various skills needed for the job. I already know Python and SQL (which is why I was hired, or so I'm told), but I know nothing about DE. I'm about to start a Udemy course on Python, SQL, and PySpark, but I'm afraid it might not be enough. Any help would be appreciated, thanks!
@samb23692 1 year ago
@francismagnusson378 Hi, how did you proceed? How's your job going?
@bartstough8201 3 days ago
Still a great overview. Makes everything a lot clearer. Thank you.
@DataEngUncomplicated 3 days ago
Glad it was helpful!
@vibhavmishra2002 1 year ago
I am glad I found this video. Brilliant overview. Cheers!
@mehmetkaya4330 1 year ago
Such a great video! Summarized basic AWS services for data engineering very nicely! One of the best! Thanks!
@DataEngUncomplicated 1 year ago
Thank you very much!
@rahulsrivastava9787 11 days ago
The concepts in this video went inside my brain like a hot knife going in butter. Great video for someone like me who comes from a functional background. Great work...really appreciated.
@ricardolizano8851 8 months ago
This is pure gold. Thanks!
@saiduluchintha3766 2 years ago
Excellent video. The sequence you covered this in is seamless. I will surely keep this for quick reference.
@DataEngUncomplicated 2 years ago
Thanks so much for your kind words Saidulu. I really appreciate it.
@LittleDetours 1 year ago
Love your clarity on the topic. Subscribed! Can't wait to explore all your videos👀
@DataEngUncomplicated 1 year ago
Thanks for the feedback and subscribing!
@jihanzhang5527 1 year ago
Great video. Something to add here: S3 Select can be used for quick, ad hoc queries against a single S3 object. Athena can also query S3 files directly if you just need to explore and investigate the data quickly. EMR Serverless removes the headache of managing an EMR cluster while giving you more power for ML.
@jasminew7573 2 years ago
Great video Adriano! It helped me understand all the AWS services better.
@DataEngUncomplicated 2 years ago
Thank you! I'm glad the video was helpful.
@user-ij4ih8qp3e 1 month ago
Thank you so much. Your tutorial helps me a lot.
@Ghillieye 1 year ago
Great overview, and I think your method of slowly explaining the diagram section by section is brilliant! A follow-up video of a real use case would be even better. Subbed!
@DataEngUncomplicated 1 year ago
Hi Adrian, thanks for the feedback and for subscribing! Can you elaborate on your suggestion? Are you thinking of an actual hands-on tutorial or an overview-style use case video?
@user-bq9ph5im1q 9 months ago
Thanks for creating this video. You explained the concepts very clearly.
@DataEngUncomplicated 9 months ago
You're welcome. I'm glad you found the video helpful
@nareshs7710 1 year ago
Simply well articulated.
@shreyaroraa2234 1 year ago
One of the best AWS explanations I have seen so far.
@DataEngUncomplicated 1 year ago
Thanks Shrey, much appreciated!
@go556 1 year ago
This is the most helpful video I have seen so far about AWS services for DE. I hope there are more like it. Thanks a lot for sharing!
@DataEngUncomplicated 1 year ago
Thanks for the comment! Yes, new videos related to data engineering and AWS every week!
@nikeating 2 years ago
Great video. Super well summarised
@DataEngUncomplicated 2 years ago
Thanks Nic!
@endpermia 9 months ago
Thank you so much for this video! I have an interview tomorrow and this boosted my understanding and confidence. Great explanations!
@DataEngUncomplicated 9 months ago
Thanks, best of luck with your interview!
@kennedysigauke953 1 year ago
Very informative, thanks!
@cludianobre 8 months ago
Fantastic video. Thanks for this.
@chriscrocker438 1 year ago
I'm currently working on my AWS certification and will be referencing the diagram from this video often. Thanks for the clear and concise walkthrough of the context of each of these services!
@DataEngUncomplicated 1 year ago
You're welcome Chris, I'm glad it was helpful. Good luck on your AWS certification!
@sushilamahato 2 years ago
I found this video very useful as a learner. Thank you!
@DataEngUncomplicated 2 years ago
Thanks Sushila, I'm glad you found it helpful!
@johndanson4427 3 months ago
10/10 in 2022, although in the last year or two Apache Iceberg, Spark, and Kafka have been added to the mix and have surfaced as "need-to-haves" rather than "also useful". Still the best data engineering overview demo on YT.
@DataEngUncomplicated 1 month ago
Thanks John. Yeah, I was thinking of updating this for 2024 or 2025 at some point. You are right: lakehouse table formats such as Iceberg, Delta, and Hudi are now supported as well.
@victoriwuoha3081 2 years ago
@DataEng Uncomplicated. This has to be one of the best explanations of how I can use AWS for my data analytics engineering workloads. Thank you for the detailed summary of the various services.
@DataEngUncomplicated 2 years ago
Thanks Victor, much appreciated!
@clovisfilho93 1 year ago
Great video! Thanks for sharing, it really helped me better understand AWS tools.
@DataEngUncomplicated 1 year ago
You're welcome. I'm glad it was helpful!
@SlimmDrea 2 years ago
This was perfect! Exactly what I was looking for lol.
@DataEngUncomplicated 2 years ago
Thanks Andraya!
@melimesesan5786 4 months ago
Awesome job!
@ajprasad6865 1 month ago
Clear and concise
@obiebbw6630 2 years ago
As someone else commented, I'm learning to be a data engineer, and learning what each application is used for has been a struggle. I'm learning the Azure system, but seeing this visual helped. New sub.
@DataEngUncomplicated 2 years ago
Thanks Obie! Much appreciated
@Omer698 4 months ago
Your channel is a godsend. Data engineering channels are rare on YouTube, and those that do exist are tailored towards Indian students. Thank you for the content; you've got a new subscriber.
@DataEngUncomplicated 4 months ago
You're welcome! Thanks for subscribing!
@oluwatobitobias 2 years ago
God bless the works of your hands. Great job!
@DataEngUncomplicated 2 years ago
Thank you Oluwatobi!
@tamasensei550 3 months ago
This video is really great. As an ETL developer, I aspire to become a data engineer in the next few years. Your explanation is very clear!
@DataEngUncomplicated 3 months ago
Glad it was helpful!
@BeABetterDev 2 years ago
Amazing Video!
@DataEngUncomplicated 2 years ago
Thanks Dan!
@tiisetsomokhesi5206 2 years ago
I am starting as a data engineer with a company that uses AWS (I come from an Azure background). This video has been really helpful with the architecture and services.
@DataEngUncomplicated 2 years ago
Thanks Tiisetso! I'm glad this video was helpful. Thank you for leaving me a comment
@naraendrareddy273 9 months ago
Hands down the number 1 video for beginner Data Engineers
@DataEngUncomplicated 9 months ago
Thanks for your kind words!
@networkfreddy2000 4 months ago
Quickly subscribed. Currently an AWS Cloud Engineer for an AI company, so I've been upskilling in data engineering. Planning to take the DEA-C01 exam. Great information, and your presentation style is perfect!
@DataEngUncomplicated 4 months ago
Thanks so much for the kind words! I'm glad it was helpful. Good luck on the exam!
@ravivarma8988 1 year ago
It works! Thanks a lot.
@groundingtiming 4 months ago
Wow, this is awesome!
@brozkeeper 11 months ago
You, sir, are the main man. Thank you.
@DataEngUncomplicated 11 months ago
Haha. You're welcome!
@saquib513 2 years ago
This is such a great video. Any chance you will be doing a full-fledged video on implementing these tools together? And I love your teaching style; I would love to know if you offer any courses that I can take.
@DataEngUncomplicated 2 years ago
Hi Nazz, thank you for your kind words! Yes, I plan on making a playlist of technical tutorials on implementing each component, so if you subscribe to my channel, you will get notified when those videos are released! Unfortunately, I don't offer any courses at this time. I'm just focusing on making YouTube videos to help data engineers on AWS!
@sergiojulio 2 years ago
Great video, thanks!
@DataEngUncomplicated 2 years ago
You're welcome Sergio, thanks for leaving a comment.
@senthilsds 2 years ago
I am looking for hands-on experience. This video helps me understand the concepts better.
@DataEngUncomplicated 2 years ago
Thanks Senthil, I'm glad it was helpful.
@jamespaz4333 1 year ago
Wowww excellent video. Thank you very much. Is there any course that you could recommend to learn these specific tools?
@sailpawar6164 1 year ago
I had watched so many other videos on the same topic. This is the one I was looking for, even though I didn't know exactly what I was looking for, as everything was new.
@DataEngUncomplicated 1 year ago
Thanks Sail! I'm glad it was what you were looking for. What were you searching for on YouTube exactly?
@sailpawar6164 1 year ago
@DataEngUncomplicated I am familiar with the Hadoop environment. I wanted to know how to do all of it in AWS. Now I know! Thanks.
@demohub 2 years ago
Thanks for sharing
@DataEngUncomplicated 2 years ago
You're welcome!
@gilang6128 20 days ago
Love this.
@mwaqze 1 year ago
Hi there, thanks for such a wonderful explanation of a complex topic. Can you share the diagram picture through a link please?
@rofu37 1 year ago
Channel name checks out
@andrewting3081 1 year ago
Bruh, thank you SO MUCH!
@DataEngUncomplicated 1 year ago
Thanks Andrew!
@jwtsfj 2 years ago
You are a legend, sir.
@DataEngUncomplicated 2 years ago
Thanks for your kind words John!
@techtransform 2 years ago
You are awesome
@309-baby7 1 year ago
Great introduction to these services. Is there a specific data integration service for pulling data from a Salesforce (CRM) source?
@DataEngUncomplicated 1 year ago
AWS Glue appears to have Salesforce connectors, so that would be an option. I'm sure you could also do it in Lambda functions if your data is small enough.
@sandeepsingavarapu3839 2 years ago
Very informative video, thank you. I am trying to learn data engineering and work on some real-world projects. Could you create a few videos on end-to-end data engineering projects, and also share some real-world project ideas to try?
@DataEngUncomplicated 2 years ago
Hi Sandeep, yes, this is high on my video list! Thanks for the suggestion!
@Bill0102 5 months ago
Your work is truly impressive; it reminds me of a book I read that had a similar impact. "AWS Unleashed: Mastering Amazon Web Services for Software Engineers" by Harrison Quill
@pankajjagdale2005 10 months ago
Great video! AppFlow is missing. Thank you.
@DataEngUncomplicated 10 months ago
Great point, I know this service is being used more recently
@samb23692 1 year ago
Hi, Great video. Can you give suggestions on how to start learning these services?
@DataEngUncomplicated 1 year ago
Sure, I can provide some suggestions on how to start learning AWS services:
1. Start with the AWS Free Tier: AWS offers a free tier for many of its services, which allows you to explore and experiment with them without incurring any charges. This is a great way to get started and familiarize yourself with the AWS platform.
2. Take online courses and tutorials: AWS provides a wealth of resources for learning, including online courses, tutorials, and documentation. You can start with the AWS Training and Certification website, which provides a range of free and paid courses on various AWS services.
3. Join AWS user groups and forums: Joining user groups and forums can be a great way to learn from other AWS users and get answers to your questions. AWS provides an official forum, and there are many user groups around the world.
4. Practice with real-world scenarios: Once you have a basic understanding of AWS services, try to apply what you have learned to real-world scenarios. This will help you understand how the services work together and how they can be used to solve real-world problems.
5. Get certified: AWS offers a range of certifications for different roles and levels of expertise. Getting certified can be a great way to demonstrate your skills and knowledge to potential employers.
@Larry21924 3 months ago
This is pure perfection. I read a book with similar content, and it was pure perfection. "Mastering AWS: A Software Engineers Guide" by Nathan Vale
@DataEngUncomplicated 3 months ago
Thanks Larry!
@chandrabhatt 1 year ago
Subscribed
@mercantilism954 1 year ago
Thank you for the great video. I have one question. Wouldn't it be very costly to use all of the AWS services? I store lots of data in S3 and it costs $100-150 a month.
@DataEngUncomplicated 1 year ago
Hi, it all depends on your use case for your data and your access patterns. For example, even if you have 10 TB of data, that doesn't mean you query all 10 TB every time; you are usually querying only subsets of your data.
@helovesdata8483 1 year ago
I'm preparing for a data engineer interview. The company is looking for someone good at creating pipelines in AWS. I'm going to use your videos. I have read so many different definitions of "ingestion". Ingestion comes right after extraction in the ETL process, right?
@DataEngUncomplicated 1 year ago
Thanks! Glad the videos are helpful. I hope your interview went well!
@jamisonlewis4884 1 year ago
Very well done! Don't forget the importance of data lineage though. Big time clients always want the capability to visually track data lineage.
@DataEngUncomplicated 1 year ago
Thanks Jamison! Yes, this is important, but AWS hasn't released a dedicated data lineage or governance service yet. DataZone was announced at re:Invent and should hopefully fill this gap.
@gabrieljeca 2 years ago
Good content! But how about Amazon Managed Workflows for Apache Airflow (MWAA) for orchestration? Wouldn't it be better to orchestrate Lambdas and Glue jobs with MWAA?
@DataEngUncomplicated 2 years ago
Hi Gabriel, thank you! Yeah, this is a great point; it could have been added to the orchestration component of the diagram. It's a good option, but I don't think it's necessarily "better", since it's another server you have to pay to keep running 24/7, whereas Step Functions and Glue orchestration are serverless and you only pay based on the number of invocations.
@gabrieljeca 2 years ago
@DataEngUncomplicated Thanks for the answer and the great insight there. I guess going serverless is always the best option. But are the execution logs from both Glue orchestration and Step Functions accessible in CloudTrail?
@DataEngUncomplicated 2 years ago
The logs for Glue and Step Functions are actually accessible in CloudWatch Logs.
@BenOgorek 1 year ago
Great video! 2 questions: 1) Any reason you didn’t mention DMS? 2) What services help you out with database changes (deltas)?
@DataEngUncomplicated 1 year ago
Hey Ben, good callout on DMS. DMS is a good service for data engineers to learn if their focus is on data migration. For database changes, I have used either AWS Glue or Lambda functions, depending on the size of the data, and built the delta logic in Python.
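As a rough illustration of the "delta logic in Python" mentioned in the reply above, here is a minimal sketch, not the author's actual code. It assumes each snapshot is a list of dict rows that share a unique `id` column; the function name `compute_delta` is hypothetical.

```python
from typing import Any, Dict, Iterable, List, Tuple

Row = Dict[str, Any]


def compute_delta(
    previous: Iterable[Row], current: Iterable[Row], key: str = "id"
) -> Tuple[List[Row], List[Row], List[Row]]:
    """Compare two snapshots and return (inserted, updated, deleted) rows."""
    # Index both snapshots by the (assumed unique) key column.
    prev_by_key = {row[key]: row for row in previous}
    curr_by_key = {row[key]: row for row in current}

    # Rows whose key only exists in the new snapshot.
    inserted = [row for k, row in curr_by_key.items() if k not in prev_by_key]
    # Rows present in both snapshots whose contents differ.
    updated = [
        row
        for k, row in curr_by_key.items()
        if k in prev_by_key and row != prev_by_key[k]
    ]
    # Rows whose key disappeared from the new snapshot.
    deleted = [row for k, row in prev_by_key.items() if k not in curr_by_key]
    return inserted, updated, deleted
```

Holding both snapshots in memory like this only suits small tables, which matches the reply's point that the choice between Lambda and Glue depends on data size.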
@iamdare 2 years ago
Thanks for this. Do you have a course on Udemy on data engineering?
@DataEngUncomplicated 2 years ago
Thanks Dare. Unfortunately, I don't yet.
@user-ym9hn4km8l 8 months ago
Why is there an arrow from the AWS Glue Catalog to the data warehouse (Redshift)?
@DataEngUncomplicated 7 months ago
The Glue catalog works on databases as well as data lakes, so you can define your Redshift datasets in AWS Glue to keep track of them.
@Draco-pu4ro 1 year ago
Hi, what services should I use if I have a source that sends CSV files and the schema changes every week? The column names are different and new columns are added each time. Ideally, I need to expose the data from these files as tables. Any suggestions as to which services I should use?
@DataEngUncomplicated 1 year ago
Hi Draco, it sounds like the Glue catalog would be a good place to register your data, as it handles schema drift. You can use a crawler to automatically scan the files and identify changes in the schema.
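To make "schema drift" concrete, here is a small standard-library sketch that compares the header rows of two weekly CSV files. It is only an illustration of the idea, not how a Glue crawler works internally; the helper names `read_header` and `diff_schema` are hypothetical.

```python
import csv
import io
from typing import Dict, List


def read_header(csv_text: str) -> List[str]:
    """Return the column names from the first row of a CSV document."""
    reader = csv.reader(io.StringIO(csv_text))
    # An empty file has no header row, so fall back to an empty list.
    return next(reader, [])


def diff_schema(old_columns: List[str], new_columns: List[str]) -> Dict[str, List[str]]:
    """Report which columns were added or removed between two files."""
    old_set, new_set = set(old_columns), set(new_columns)
    return {
        "added": [c for c in new_columns if c not in old_set],
        "removed": [c for c in old_columns if c not in new_set],
    }


if __name__ == "__main__":
    week1 = "id,name\n1,a\n"
    week2 = "id,full_name,email\n1,a,a@example.com\n"
    print(diff_schema(read_header(week1), read_header(week2)))
```

Note that a renamed column shows up as one removal plus one addition; deciding whether those are the same field is a modelling decision that a header comparison alone cannot make.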
@helovesdata8483 2 years ago
If we clean the data after loading it into S3, this would be ELT, right?
@DataEngUncomplicated 2 years ago
Yeah, you got it!
@danpefok3793 1 year ago
Interesting! Great job, but the author did not speak about security. I think we need security too.
@DataEngUncomplicated 1 year ago
Thanks Dan, you are right, I left out security. I would put KMS in a security section for anyone who wants to encrypt their data with KMS keys.
@suleimanumar258 1 year ago
Can you do the same but for Azure services?
@DataEngUncomplicated 1 year ago
Hi Suleiman, sorry, I'm not as familiar with Azure services. AWS is what I am currently focused on.
@josecarlossantos7673 1 year ago
What's the alternative for loading data from the curated zone into Redshift and Athena? Lambda + Glue, or isn't that necessary?
@DataEngUncomplicated 1 year ago
When you say alternatives, do you mean AWS-native alternatives? AWS announced an auto-load feature from S3 to Redshift, which I guess assumes that the schemas are the same.
@josecarlossantos7673 1 year ago
@DataEngUncomplicated Yes. For example, with Lambda and EMR before the curated layer, do you recommend any AWS service to load the data from the curated layer into Redshift and Athena afterwards?
@DaveThomson 2 months ago
Question: If you are pulling data from external APIs, would you use Glue to do this, or would you use something else to get this information and store it in S3 first, and then use Glue to transform the data in S3?
@DataEngUncomplicated 2 months ago
Great question Dave! I would recommend using Lambda functions to ingest the data into S3. Glue is for processing large amounts of data and has a bit of a startup time. You will probably want a Lambda function pulling data from your API frequently, so the data load size would probably be relatively low.
@DaveThomson 2 months ago
Thanks, that's the direction I went. I have a meta Lambda, a datasource Lambda (one for each data source), and an S3 upload Lambda, all orchestrated with Step Functions. This way the meta Lambda gets the customer information required and spawns parallel datasource Lambdas, which all pass data to S3 upload Lambdas. My only concern is how best to structure it in S3 for Glue. The end goal here is Athena / QuickSight for BI purposes. I looked at AppFlow for some of this but hated it, since I couldn't get all the objects at one time and had to build one flow per object. So if a single data source has a lot of objects, that's a lot of flows, which seems annoying.
@DaveThomson 2 months ago
@DataEngUncomplicated Thanks. I went with Step Functions: a master function that gets customer metadata and spawns functions for different data sources, which all end up calling an S3 upload function. Now my only concern is whether I am storing the data properly in S3 for Glue to make use of. Something like:
# Format the file name based on the current date and data type
file_name = f"{data_type}_{year}-{month}-{day}.json"
# Update the S3 key (path) to use the 'year=YYYY/month=MM/day=DD' partitioning convention
s3_key = f"{customer_id}/year={year}/month={month}/day={day}/{data_type}/{file_name}"
@DataEngUncomplicated 1 month ago
@DaveThomson I hope you figured this structure out by now, but if you want to use Athena, you need to have your datasets separated into different prefixes (folders) in S3. I would add a partitioning strategy as well, which will save you query costs if you know how your data will be queried.
@DaveThomson 1 month ago
@DataEngUncomplicated Thanks!
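One way to read the advice in this thread (separate each dataset into its own folder, then partition) is sketched below. The layout is an assumption for illustration, not something either commenter confirmed: it puts the data type first so every dataset has its own prefix, and uses Hive-style `key=value` folders, a convention that Glue crawlers and Athena recognise as partitions. The function name `build_s3_key` is hypothetical.

```python
from datetime import date


def build_s3_key(customer_id: str, data_type: str, day: date) -> str:
    """Build a key with the dataset first, then Hive-style partition folders."""
    # File name carries the data type and the ISO date, as in the comment above.
    file_name = f"{data_type}_{day:%Y-%m-%d}.json"
    return (
        f"{data_type}/customer_id={customer_id}/"
        f"year={day:%Y}/month={day:%m}/day={day:%d}/{file_name}"
    )


if __name__ == "__main__":
    print(build_s3_key("c42", "orders", date(2024, 6, 16)))
```

Which columns to partition on should follow how the data will be filtered in queries, since partitioning only saves cost when queries actually restrict on the partition columns.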
@anoopkumar2142 2 months ago
Hopefully zero-ETL is going to remove a major chunk of the dependencies involved in managing data within the AWS ecosystem.
@makhus8337 9 months ago
Can you do an entire project for this?
@DataEngUncomplicated 9 months ago
Yes, I have done projects using most of these services in the past.
@playingneutral 1 year ago
AWS is not a career but just a cloud platform, right? A place where we can apply our skills and start working in a cloud-based environment? Please clarify: if I come from a non-tech background with no data analytics experience and pursue the AWS data analytics certification, preparing with the learning material AWS provides plus hands-on practice, would I get a job easily? Or do I need to specialize in all 200 services, plus Python and so on? Please guide me; I'm not getting an answer to this anywhere.
@DataEngUncomplicated 1 year ago
Hello, these are good questions that lots of people starting with AWS might have! Yes, AWS is just a cloud platform. I would say you should still have the foundational data analytics skill set in order to be successful. You definitely don't need to specialize in 200 services to get a job. I would focus on learning the services that are relevant for a particular role. Nobody knows every single AWS service; there are just too many. As for whether an AWS certification is enough to get a job, it all depends on the role, the employer, and what they are looking for. I would say it can't hurt your chances of getting a job if you are looking for a role that involves AWS.
@EthanDeng 1 year ago
Use AWS Batch for batch data ingestion.
@hunnidkray534 1 year ago
Is there a PDF file to print out the diagrams?
@DataEngUncomplicated 3 months ago
No sorry, unfortunately I don't have one.
@nainaarabha9186 11 months ago
I have one question. Can we host multiple Kafka producers on one EC2 instance?
@DataEngUncomplicated 11 months ago
Are you talking about using Amazon Managed Streaming for Apache Kafka?
@nainaarabha9186 11 months ago
@DataEngUncomplicated Yes!
@rememberthename911g 2 years ago
How would I incorporate AWS into my project if I am using a website's API as the source of my data?
@rememberthename911g 2 years ago
It's not much data, a couple hundred lines at most, but I still want to be able to show an employer I can use different services.
@DataEngUncomplicated 2 years ago
I think you're asking how to ingest data from an API into AWS? There are many ways to do this, but for your purpose you can write a Lambda function that uses the requests library to read data from the API and the Python library AWS Data Wrangler to write the data to S3.
@rememberthename911g 2 years ago
@DataEngUncomplicated That's exactly what I was asking, thanks. Can you make sure this sounds correct though? 1. AWS Lambda to ingest data from the API call and write that data to an S3 bucket. 2. Read data from S3 using a Python notebook (with the PySpark package), or read data from S3 using AWS EMR.
@DataEngUncomplicated 2 years ago
@rememberthename911g Yup, this works. You might want to define your data source in a Glue catalog table so it can be ingested more easily into a Glue job or PySpark job.
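Step 1 of the plan in this thread could look roughly like the sketch below. It deliberately swaps in different tools from the ones named in the reply: the standard library's `urllib.request` instead of requests, and a plain boto3 S3 client `put_object` call instead of AWS Data Wrangler, so that the helper can be exercised without extra packages. `API_URL`, `BUCKET`, the key layout, and the `ingest` helper are all hypothetical placeholders.

```python
import json
import urllib.request
from datetime import datetime, timezone

API_URL = "https://example.com/api/items"  # hypothetical endpoint
BUCKET = "my-raw-bucket"  # hypothetical bucket name


def fetch_json(url: str):
    """Download and decode a JSON payload using only the standard library."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


def ingest(s3_client, fetch=fetch_json, now=None) -> str:
    """Fetch the API payload, store it as one JSON object in S3, return the key."""
    now = now or datetime.now(timezone.utc)
    payload = fetch(API_URL)
    # Date-based prefix so each run lands in a predictable folder (assumed layout).
    key = f"raw/items/{now:%Y/%m/%d}/items_{now:%H%M%S}.json"
    s3_client.put_object(
        Bucket=BUCKET,
        Key=key,
        Body=json.dumps(payload).encode("utf-8"),
    )
    return key


def lambda_handler(event, context):
    """Lambda entry point; boto3 ships with the Lambda Python runtime."""
    import boto3  # imported lazily so the helpers above run without AWS installed

    key = ingest(boto3.client("s3"))
    return {"statusCode": 200, "body": key}
```

Passing the S3 client and the fetch function in as arguments keeps the core logic testable locally with simple stand-ins before anything is deployed.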
@hamalishah 4 months ago
Interesting.
@DaveThomson 2 months ago
Do you do any consulting?
@DataEngUncomplicated 2 months ago
Hey David, I'm actually a full-time AWS D&A consultant for a company that is an AWS partner. Let me know if you want to chat.
@DaveThomson 2 months ago
@DataEngUncomplicated I would like to chat. I too work full time for a partner.
@DataEngUncomplicated 1 month ago
@DaveThomson Great, feel free to contact me through the email I have posted on my channel.
@DaveThomson 1 month ago
@DataEngUncomplicated Sent you an email.
@surendhirankrishnamoorthy6689 4 months ago
As the channel name says, you're making things uncomplicated. 🎉😅
@Naveen-hk3yh 10 months ago
Great video! I have been working as an AWS data engineer for the past two years, with 11 years of experience overall. Could you recommend which certification I should pursue as a data engineer? I'm confused because there are so many different AWS certifications.
@DataEngUncomplicated 9 months ago
Hi Naveen! Yeah, it is confusing because there isn't really a specific data engineering certification. The Developer Associate and the AWS Data Analytics Specialty are the best ones. I would also go after the Database Specialty if you think you will be working a lot with databases.
@nj6553 1 year ago
Millions or billions...
@DataEngUncomplicated 1 year ago
I have no context for what this means, but I'm going to respond with: we can process millions or billions of records in data engineering with AWS 😉
@user-br6oe3kf9k 6 months ago
Hi sir, I am currently learning SQL and Python. Should I start learning big data? I don't know the proper way to start. The way you explained things was so good that I felt like asking. I would be really grateful if you could help me with this. How can I contact you, sir?
@DataEngUncomplicated 6 months ago
Hello, I know there are a lot of concepts and technologies to learn! You can reach me at dataenguncomplicated@gmail.com