The Biggest Issues I've Faced Web Scraping (and how to fix them) 

ForrestKnight
544K subscribers
30K views

Try out Bright Data and get $15 credit for your projects! brdta.com/fknight
0:00 Problems I face web scraping
1:03 Web Scraping Basics Overview
4:38 Handling Complex Web Technologies
6:24 Script Optimization + Error Handling + Adaptive Algorithms
8:23 AI-Driven Proxy Management, Anonymity, and Intelligent Rate Limiting
10:23 How to Handle Extracted Data
12:22 Ethical AI and Legal Compliance
14:15 Thanks for Watching!
If you're a developer, sign up to my free newsletter Dev Notes 👉 www.devnotesdaily.com/
If you're a student, check out my Notion template Studious: notionstudent.com
Don't know why you'd want to follow me on other socials. I don't even post. But here you go.
🐱‍🚀 GitHub: github.com/forrestknight
🐦 Twitter: / forrestpknight
💼 LinkedIn: / forrestpknight
📸 Instagram: / forrestpknight

Published: 19 Jun 2024

Comments: 56
@delsix1222 3 months ago
interesting timing to see this video, literally the day after I completed my first full-stack application which literally revolves around web-scraping :D
@flipygmd 3 months ago
You're the next Mark Zuckerberg
@Noumaan_Ahamed 3 months ago
How do you web scrape a secure website?
@IshaqKhan010 14 days ago
Share the website URL
@delsix1222 13 days ago
@IshaqKhan010 can't share URLs in YT comments, they get autofiltered
@panz__ 1 month ago
In my opinion, having developed multiple web scraping applications, half of the time is not spent coding but trying to reverse engineer the web application. Simple ones are just a matter of looking at requests in dev tools and manually making API calls, while the most complicated ones involve backtracing how content is loaded on the page to find the JS code responsible for it. Basically it's 70% reverse engineering and 30% coding, if you do things the smart way.
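The simpler case described in this comment (spotting a JSON request in the browser's dev tools and calling it directly) might look roughly like the sketch below. The endpoint `https://example.com/api/products`, its `page` and `per_page` parameters, and the `items`/`name`/`price` field names are all hypothetical placeholders, not anything taken from the video; a real site will have its own URL, parameters, and response shape.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Hypothetical endpoint, as it might appear in the dev tools Network tab.
API_URL = "https://example.com/api/products"


def build_request(page, per_page=20):
    """Recreate the request the page itself makes, without rendering any HTML."""
    query = urlencode({"page": page, "per_page": per_page})
    return Request(f"{API_URL}?{query}", headers={"Accept": "application/json"})


def extract_items(payload):
    """Keep only the fields of interest from the (assumed) JSON response shape."""
    data = json.loads(payload)
    return [
        {"name": item["name"], "price": item["price"]}
        for item in data.get("items", [])
    ]


def fetch_page(page):
    """Call the endpoint and return the parsed items for one page."""
    with urlopen(build_request(page), timeout=10) as resp:
        return extract_items(resp.read().decode("utf-8"))
```

Keeping request building and parsing separate from the network call makes it easy to check both against a response saved from dev tools before hitting the live site.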
@doublesushi5990 3 months ago
such a chill vid
@sumukh007 3 months ago
The JD bottle in the background 😉
@danielabraham3022 3 months ago
To be honest, I subscribed because the button lit up. Also, I love your content.
@xlafxx 3 months ago
I remember starting to watch your videos when I was entering my computer science BA, and now that I'm 28 with 1 semester left to graduate, you're still uploading good content that's unique. Never get tired of your vids, keep it up brother. I'm also concerned about the job market. Can you make a vid about new grad CS students? For example, it seems almost every job wants front end or something, and my school never taught any of it.
@mrrobot-mn6re 2 months ago
You want to get a job from what your school taught you? You are in for a ride, brother. Tech is about your own research and self-learning, every fucking day. I pity people who majored in CS because they heard about a programmer earning 6 figs.
@Hshjshshjsj72727 1 month ago
Unless u went to Ivy League and wanna be a quant, then u gotta do front end. JS, React, SQL are key for the majority. School is dumb unless it's Ivy League, except for the piece of paper.
@dalar2 3 months ago
I used to web scrape all the time, but stupid JS frameworks' obfuscated CSS class names have made it very difficult.
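One common way around obfuscated class names, where the site allows it, is to key on attributes that tend to stay stable across builds instead of on `class`. The sketch below uses only Python's standard `html.parser`; the `data-testid="price"` attribute is an assumption made for illustration, and plenty of sites expose nothing this convenient.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class AttrTextExtractor(HTMLParser):
    """Collect the text of every element whose attribute `attr` equals `value`."""

    def __init__(self, attr, value):
        super().__init__()
        self.attr = attr
        self.value = value
        self.results = []
        self._depth = 0   # greater than 0 while inside a matching element
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._depth:
            self._depth += 1              # nested tag inside a match
        elif dict(attrs).get(self.attr) == self.value:
            self._depth = 1               # entering a matching element
            self._buf = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:              # leaving the matching element
            self.results.append("".join(self._buf).strip())

    def handle_data(self, data):
        if self._depth:
            self._buf.append(data)


def extract_by_attr(html, attr, value):
    parser = AttrTextExtractor(attr, value)
    parser.feed(html)
    return parser.results
```

For example, `extract_by_attr(page_html, "data-testid", "price")` ignores whatever the generated class names happen to be in the current build.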
@redbill5197 3 months ago
Thank you for the amazing video! Much appreciated as a young web developer. By the way, none of the buttons lit up or did any animations... I am a subscriber, so I don't know if that's why. Peace!!!
@beaconxy 2 months ago
It actually didn't.
@EduardoEscarez 3 months ago
AFAIK the button highlighting is a feature based on video subtitles, including those generated automatically, but it's still somewhat random. I didn't catch it because I was already subscribed and had liked the video a moment before you said it.
@v1d300 3 months ago
I don't think it's a video subtitles feature. It just happens randomly in my experience. The thumbs-up button shakes and subscribe highlights. Didn't happen for me on this video though :(
@olhodetamarutaca 22 days ago
I really like the way you explain things and also the pronunciation issues
@xdcountry 3 months ago
This guy gets it. I've been there. I can't wait to make this all an easy-ass Python plugin
@tomasemilio 3 months ago
Boom. Thanks
@Cryogenics12 3 months ago
Hi Forrest. I was wondering how you feel now about AI and the future of software engineering. With ChatGPT out for over a year now, have your views changed much? Maybe a good topic for another vid.
@brianmorin5547 1 month ago
Is there a reason/advantage to using Bright Data's "scraping browser" product instead of integrating their proxy and IP rotation services into a script I'm running on my own server?
@manumartinezkcxu 16 days ago
What are the best AI scraping apps? Any suggestions/recommendations? Just looking at how our nonprofit organization is aligned with other organizations within a county in California in order to partner with them.
@carsonjamesiv2512 3 months ago
GOOD VIDEO🎉👍
@johnknox4293 2 months ago
interesting....thanks man
@olasunkanmioyetunji9254 2 months ago
Can you recommend a course to learn web scraping? One that teaches the tools and techniques you mentioned and other concepts.
@ramelox 3 months ago
When I see a Bright Data sponsorship, I instantly stop watching. Paying Bright Data is not a web scraping skill.
@zeddscarlxrd4331 3 months ago
Do u know how to bypass Cloudflare or CAPTCHA without Bright Data?
@ZacMagee 3 months ago
Some people 😂 That's like saying, "Oh well, these stupid people who drive cars, why would they do that when we still have horses?"
@vasyavasin7364 2 months ago
@ZacMagee why should I pay for it if I can do it for free? 😂
@vasyavasin7364 2 months ago
@zeddscarlxrd4331 How to bypass Cloudflare is easy to find.
@Ohiostategenerationx 2 months ago
@vasyavasin7364 do you still not need to scrape a bunch of proxies to use?
@V4rrow 3 months ago
Dude is literally Gilfoyle from Silicon Valley (love your vids)
@theparten 3 months ago
I wasn't looking for a web scraping video but his face drew my attention. I was like, wait, this is Gilfoyle, right? 😂❤...
@FFl1s 3 months ago
Fr
@v1d300 3 months ago
I am working on a project that relies heavily on scraping, so I've been doing a lot of research, and it's really hard to find anything good that is not sponsored by Bright Data. I get it, their marketing team has done a great job of tapping a perfect niche of creators who provide valuable information, but it also means that almost every good resource ends up being about using Bright Data, and that's not something I want to pay for when starting a hobby project. Anyway, this is a great video either way. I learned a lot of things I hadn't considered in my planning, like ETL (that's a new rabbit hole I need to dive into) or adaptive content extraction to account for layout changes. I was just assuming I would set up reporting to notify me when I start getting no content, and then fix it. So thank you for that. Do you set up Redis or something so that some requests are served from a cache of recently requested data rather than scraping again or hitting the DB? Is that necessary? And at what point should a webhook be set up, and for what purpose exactly? Thank you
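To make the caching question in this comment concrete, here is a minimal sketch of the idea: serve a recently fetched URL from a cache and only scrape again once the entry is older than a time-to-live. It uses an in-memory dict as a stand-in for Redis, and `TTLCache`, `fetch`, and the 300-second default are all illustrative assumptions rather than anything recommended in the video.

```python
import time


class TTLCache:
    """Return a cached response for a URL until it is older than ttl_seconds."""

    def __init__(self, fetch, ttl_seconds=300.0, clock=time.monotonic):
        self._fetch = fetch        # function that actually scrapes a URL
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so expiry can be tested
        self._store = {}           # url -> (time fetched, response body)

    def get(self, url):
        now = self._clock()
        hit = self._store.get(url)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]          # fresh enough: skip the network entirely
        body = self._fetch(url)    # missing or stale: scrape again
        self._store[url] = (now, body)
        return body
```

Whether this is necessary depends on how often the same pages are requested; for a hobby project an in-process cache like this may be enough, with an external store only becoming useful once several workers need to share it.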
@dmytro-skh 2 months ago
This video is what I need. But whoa, the screens with code change so fast... I'm too old at 35 to hit the pause button that quickly 😅 Do you have some links with those hacks?
@yafethtb 2 months ago
Yeah. Scraping a dynamic website really makes me want to scream like Linus Torvalds at NVIDIA. And I also hate Cloudflare 😂
@javancheongyujing2531 3 months ago
Does web scraping fall under data science or software engineering?
@phethindabamkhwanazi3546 3 months ago
Hey man, do you have another channel where you teach live?
@phethindabamkhwanazi3546 3 months ago
If you have, please provide the link so I can start learning more.
@JoaquimDornelles95 3 months ago
My fucking hero
@einekleineente1 2 months ago
are there vids of that ???
@realshiiiiiit8349 3 months ago
Damn this guy is cool
@francishubertovasquez2139 3 months ago
Speaking of Females, if Hitler's fuhrer have Magog carrier of motorized machine monsters then the Northern Magog have ice snow predominant in their place near Arctic circle, and ice surface can better conduct gases and science elements and compounds interaction which can attract those science things from everywhere, who between them is stronger except for the Super Magog Dark Matter? Will they suffice at full force during the final battle end times?
@storymode9085 3 months ago
wow... i got a long way to go
@botobeni 1 month ago
12:30 nuh uh 🗿🗿
@VishalJangid1 3 months ago
hopefully Bright Data ain't a snitch 🫠
@user-ut4so1vy3b 3 months ago
Your mustache looks like a hedgehog 😂
@YouStillNeedToSleep 2 months ago
Examples. Are you a Leo? he he
@OnlyUseMeEquip 17 days ago
If you are using Selenium, Puppeteer, or any other browser automation, you will never be a good web scraper. They are just too damn slow. If you rely on them to get you past the WAF JavaScript function and generate your cookies before you go scrape, others will beat you to the punch with pure code.
@abe_is_live 3 months ago
stop web scraping