Ajay, I need some help please! I've been trying all day with no luck. I cataloged a Parquet file that was saved by a Spark job. Now I'm running another job to insert the data from that Parquet file into an RDS MySQL database, and I need the rows inserted in the same order as the Parquet file to preserve the primary keys. I've tried several approaches, but the data is always inserted in a random order in the database table. Can you tell me what I can do?
Nice tutorial, Ajay. I have one question. I have a requirement to copy about 4 million records from one DynamoDB table (2017 version) to another table (2019 version), and I don't want downtime. Can you please suggest whether Glue will help in this use case? If yes, what things do I have to consider?
Awesome video!! I have a query: I want to push CSV data from S3 to Redshift tables. Can I somehow use the table schema created by the crawler to create the table in Redshift? In every tutorial the instructor first creates the table in Redshift by hand, then uses a crawler to create the schema in Glue, then pushes the data to Redshift. So what is the point of creating the schema with a crawler?
Hey Kishlaya, you should try this out. Just search whether the Glue Data Catalog can be used directly in Redshift. I do know that Redshift Spectrum can directly use the schema created by crawlers.
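For anyone else wondering: the Redshift Spectrum route mentioned above looks roughly like this. A minimal sketch, assuming an existing Glue database populated by a crawler; the schema name, Glue database name, role ARN, and table name here are all placeholders:

```sql
-- Map an external schema in Redshift onto a Glue Data Catalog database,
-- so the tables the crawler created become queryable without hand-writing DDL.
CREATE EXTERNAL SCHEMA spectrum_glue
FROM DATA CATALOG
DATABASE 'my_glue_db'
IAM_ROLE 'arn:aws:iam::123456789012:role/MySpectrumRole';

-- Query a crawled table directly through the external schema:
SELECT * FROM spectrum_glue.my_crawled_table LIMIT 10;
```

This queries the S3 data in place via Spectrum; to physically load it into a native Redshift table you would still run a COPY or INSERT … SELECT from the external table.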
Hello Ajay, I saw your LinkedIn post about the AWS Data Analytics certification. Please explain the detailed learning path you took to pass, and please make a video on this. It would be very helpful to anyone looking to pursue that exam.
Hello Ajay, your videos were very helpful. Can we get similar videos for AWS Lambda? Also, is it possible for you to put all your AWS videos (S3, Athena, Glue, Kinesis, Lambda, EMR) on Udemy so that we can buy them from you? Please share your thoughts on this.
Nice explanation of AWS Glue crawlers, which was very helpful, thanks for that. If a column in the middle is deleted in the newest file and the crawler updates the schema, then for the earlier files the deleted column still exists and the data gets shifted (I can see the data is disturbed). Is there any crawler configuration to validate the column names across the files in the S3 location?
Hi, nice explanation of AWS Glue crawlers, which was very helpful, thanks for that. I have some questions about the Glue crawler and Athena. First try: I put two different files in my S3 bucket, one a Stock table and the other an Employee table, and ran the Glue crawler. Two different tables were generated, but with empty data. Is that correct? Second try: with the same two files in the bucket, I ran the crawler with an exclude pattern of employee.csv. This time a single table was generated, but the data from both tables was merged into it. Is that correct, or have I done something wrong? Please let me know.
Hi Saurabh, you have to segregate the data into two different folders, one per table. And if a query returns no data, check whether the schema in the Glue Data Catalog matches the files.
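To illustrate the folder layout idea: a Glue crawler typically treats each S3 prefix (folder) as one table, so mixing unrelated CSVs in one folder gets them merged. A tiny sketch of the per-table key layout; the bucket and table names are made up:

```python
# Sketch: give each table its own S3 prefix so a Glue crawler infers
# one table per folder, instead of merging stock and employee CSVs
# into a single table. Bucket, table, and file names are illustrative.
BUCKET = "my-data-bucket"

def s3_key(table: str, filename: str) -> str:
    """Each table's files live under their own prefix: <table>/<file>."""
    return f"{table}/{filename}"

layout = {
    "stock": s3_key("stock", "stock.csv"),          # stock/stock.csv
    "employee": s3_key("employee", "employee.csv"), # employee/employee.csv
}
```

With this layout the crawler pointed at s3://my-data-bucket/ produces two tables, one per prefix, each with only its own files' rows.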
I uploaded a CSV into an S3 bucket. The crawler is creating the Data Catalog entry in Glue, but when I try to view the contents of the CSV file in Athena with a query, it shows blank: the columns are present but without the values.
Yes, same with me. With a single CSV file we can see the data, but when we crawl multiple files from the same folder it shows blank. Please let me know if you find the solution.
Hi Ajay, is there any way to automate this through CI/CD? I'd like to upload a bunch of crawler files, trigger the crawlers automatically, and then store the inferred schema in a local file system. Thanks in advance.
Hi Surya, you can use CI/CD to automate this. Also consider a scheduled Lambda function: you upload your files and then either trigger or schedule the processing. Hope this helps!
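A minimal sketch of the scheduled-Lambda idea: a handler that kicks off a Glue crawler when a schedule (e.g. an EventBridge rule) fires. The crawler name is a placeholder, and the Glue client is passed in as a parameter here so the logic can be exercised without AWS credentials; in a real Lambda you would build it with boto3.client("glue") and use the standard (event, context) signature:

```python
# Sketch: start a Glue crawler from a scheduled Lambda. The client is
# injected so this is testable offline; in Lambda, create it with
# boto3.client("glue"). Crawler name is illustrative.
CRAWLER_NAME = "my-schema-crawler"

def lambda_handler(event, context, glue_client):
    """Kick off the crawler; treat 'already running' as a no-op."""
    try:
        glue_client.start_crawler(Name=CRAWLER_NAME)
        return {"started": True, "crawler": CRAWLER_NAME}
    except glue_client.exceptions.CrawlerRunningException:
        # Another run is in progress; skip rather than fail the invocation.
        return {"started": False, "crawler": CRAWLER_NAME}
```

From there the crawler-inferred schema can be read back with the Glue get_table API and written wherever your pipeline needs it.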
No, there is a list that you can find in the AWS docs. Common types such as CSV, TSV, databases, logs, JSON, and Parquet are supported. You can also write a custom classifier, but that would not cover images. Try using Amazon Rekognition for images.
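For reference, a custom classifier can be registered through the Glue API like this. A sketch only: the classifier name, classification label, and Grok pattern are made up, and the client is injected so the call can be exercised offline (in practice, boto3.client("glue")):

```python
# Sketch: register a custom Grok classifier for a log format the
# built-in Glue classifiers do not recognize. Names and the pattern
# are illustrative; the client would normally be boto3.client("glue").
def create_log_classifier(glue_client):
    return glue_client.create_classifier(
        GrokClassifier={
            "Name": "my-app-logs",
            "Classification": "app-logs",
            "GrokPattern": "%{TIMESTAMP_ISO8601:ts} %{LOGLEVEL:level} %{GREEDYDATA:msg}",
        }
    )
```

Once created, the classifier is attached to a crawler, and the crawler applies it before falling back to the built-in classifiers.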
Hi Ajay, I uploaded a CSV file to an S3 bucket and created a crawler. I can see the table in the database, but when I try to preview the data I don't see any rows in the table. Could you please let me know why I don't see the data?