Web Scraping With Selenium And A Raspberry Pi - All You Need To Know 

Tinkernut
626K subscribers
69K views

The web and its websites are complicated, so basic web scraping just won't cut it when it comes to things like logins, forms, and pagination. Well, let's learn how to get what we want using Python, Selenium and a Raspberry Pi.
_____________________________
📲🔗🔗📲 IMPORTANT LINKS 📲🔗🔗📲
_____________________________
• 💻PROJECT PAGE💻 - github.com/gigafide/basic_pyt...
• PREVIOUS VIDEO - Beginners Guide To Web Scraping with Python - All You Need To Know
• quotes.toscrape.com
• stackoverflow.com/questions/6...
_____________________________
💰💰💰💰 SUPPORT THE SHOW 💰💰💰💰
_____________________________
www.tinkernut.com/donate
_____________________________
📢📢📢📢 Follow 📢📢📢📢
____________________________
redd.it/5o3tp8
/ tinkernut_ftw
/ tinkernut
/ tinkernut

Published: 25 Jun 2024

Comments: 78
@SimSimsTECHcrunch 2 years ago
The YouTube legend has returned again!!!!
@UserUnknown07 2 years ago
Can't imagine the amount of editing this video must have taken, woah! Great explanation. Thank you.
@timothycain8639 2 years ago
love this project. you made many aspects of programming with python INFINITELY MORE CLEAR TO ME.
@lachlanmoore2345 2 years ago
Use Explicit Waits when you can instead of the time module, Expected Conditions are great for this.
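A minimal sketch of the explicit-wait approach suggested above, assuming Selenium 4 and a page whose items carry the CSS class `quote` (as on quotes.toscrape.com). The function name `wait_for_quotes` and the 10-second default are illustrative choices, not taken from the video.

```python
def wait_for_quotes(driver, timeout=10):
    """Block until elements with class "quote" are present, then return them."""
    # Imported inside the function so this sketch can be loaded
    # on a machine that does not have Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # Assumption: the elements of interest use the class name "quote".
    locator = (By.CLASS_NAME, "quote")

    # until() polls the condition and raises TimeoutException after `timeout` seconds,
    # instead of always sleeping for a fixed time as time.sleep() would.
    return WebDriverWait(driver, timeout).until(
        EC.presence_of_all_elements_located(locator)
    )
```

Compared with a fixed `time.sleep()`, this returns as soon as the elements appear and fails loudly if they never do.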
@cyrustakem7993 1 year ago
I miss your videos. I don't know why YouTube stopped recommending them; they are highly educational.
@twys124 2 years ago
Great explanation and great video. I just learned about web scraping w BS4 and selenium.
@AS-fj7ox 2 years ago
Good work dude.. keep it runnin!!
@thehoneyseals 1 year ago
This made me so happy, thank you so much. You have no idea.
@johnbushur6080 2 years ago
Very useful. I came across selenium a while ago but wound up using excel tools instead. I’ll have to give this a try for my next project.
@2mrRB 2 years ago
Hey John, are you able to use excel tools to scrape websites too? Or do you mean something else? Thanks in advance :)
@johnbushur6080 2 years ago
@@2mrRB I’ve used Excel’s Web/Power Query for this in certain cases. Check out Leila Gharani’s channel for some good tutorials. I’ve also written some scripts in VBA to do it for specific tasks. That is what I meant by Excel tools. Hope that helps.
@mejia414 2 years ago
Thanks from Colombia, your video helped me a lot.
@NitishKumarIndia 2 years ago
This guy belongs to the golden age of YouTube, when things were simple.
@JasonOBrienThinksHeCan 2 years ago
Awesome!
@jasonbailey9139 2 years ago
We had a Perl script that we used to scrape data off of a website. They changed the way the login worked and Perl didn't support the new method (OK, it probably does, but I hate working with Perl scripts, so I didn't bother researching after our consultant said it didn't), so I just made the users start doing the scraping manually. Now I'm tempted to give this a try to start scraping that data again.
@NightRider0101 2 years ago
Python requests and beautiful soup are the best tools for scraping
@VisesEntei 2 years ago
Welcome back.
@mmuneebahmed 2 years ago
Awesome, thanks! Will this Selenium library also work with social media websites, or do we have to use other libraries in conjunction with Selenium?
@VikashXman 2 years ago
Thanks man
@lukasdegle8313 2 years ago
Like it a lot! But why don't you use a context manager while writing to files? :)
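A minimal sketch of what the comment above suggests: writing scraped rows to a CSV file inside a `with` block, so the file is closed even if writing fails part-way. The function name `save_quotes`, the column names and the default file name are assumptions for illustration.

```python
import csv


def save_quotes(rows, path="quotes.csv"):
    """Write (quote, author) rows to a CSV file and return how many were written."""
    # The with-block closes the file automatically, even if a write raises.
    # newline="" is what the csv module expects for files it writes to.
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["quote", "author"])  # header row (hypothetical column names)
        writer.writerows(rows)
    return len(rows)
```

Calling `save_quotes([("Some quote", "Some author")])` would produce a two-line file: the header and one data row.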
@abrandnewcompany 2 years ago
Beautiful Soup combined with requests can do everything you want, even more than Selenium. But I didn't know about the NoSuchElementException try/except, which is really handy; I always used to write a function like that myself. Thanks!
@OnixEdge 9 months ago
@Tinkernut Do you have any tips on how to keep the webdriver updated if you are using the pc and chrome?
@papusa9878 2 years ago
Good video
@CodingWithBen 2 years ago
I literally just watched your last video lol. How do I know whether I'm allowed to scrape a website or not? Is there an easy way?
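One partial answer to the question above is a site's robots.txt file, which Python's standard library can parse. The sketch below checks URLs against a made-up robots.txt; the `EXAMPLE_ROBOTS` text, the example.com URLs and the helper name `is_allowed` are all assumptions. robots.txt is only one signal, and a site's terms of service may say more.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, url, user_agent="*"):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt content, used here so the example runs offline.
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

print(is_allowed(EXAMPLE_ROBOTS, "https://example.com/page/1/"))
print(is_allowed(EXAMPLE_ROBOTS, "https://example.com/private/data"))
```

For a live site, `RobotFileParser` can also download the file itself via `set_url()` followed by `read()`.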
@jemalguillory 2 years ago
New drip!
@gkchimzz28 2 years ago
nice
@paulmagu3054 2 years ago
Selenium is very useful. Any ideas of running web-scraping on the server side with selenium preferably? (Other libraries in python or Node are welcomed suggestions!) thx.
@domasberulis 12 days ago
What are your RPi specs? My RPi 3B with 1 GB RAM takes 3 minutes to launch the browser.
@webslinger2011 2 years ago
For hiding username and passwords I use config parser to grab from a separate file. What I haven’t figured out is how to use proxies to avoid bot detection. Sorry for the hijack but I need to ask. Anyone with a good tutorial? Thanks!
@NightRider0101 2 years ago
You can use proxy cycling.
@leader1944 2 years ago
Proxies would work great to avoid detection if you are sending a large number of requests to a site very quickly. However, some sites can detect that you are using automation software by checking for a string when you send your request with webdriver. This string is $cdc_ and it’s located in the webdriver exe file. Using a hex editor, you can replace $cdc_ with any other string that has $ at the beginning, then 3 letters of any kind, and then an _ at the end. For example, $dog_. Note: Changing $cdc_ only works if you are on Chrome; otherwise you need to change a different string. Hope this helps :)
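The config-file approach that @webslinger2011 mentions at the top of this thread can be sketched with the standard `configparser` module. The file name `credentials.ini`, the `[login]` section and the helper name `load_credentials` are assumptions for illustration; the idea is simply that the file lives outside the script and outside version control.

```python
import configparser


def load_credentials(path="credentials.ini", section="login"):
    """Return (username, password) read from an INI file kept apart from the script."""
    # interpolation=None so a "%" inside a password is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    if not parser.read(path):  # read() returns the list of files it could open
        raise FileNotFoundError(f"No config file found at {path!r}")
    return parser[section]["username"], parser[section]["password"]
```

The matching file would contain a `[login]` header followed by `username = ...` and `password = ...` lines.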
@d-rey1758 1 year ago
where in the video did you mention running this on a raspberry pi?
@JNET_Reloaded 26 days ago
Drivers don't work that well on the RPi 5 with the newly supported browsers, so we need real automation without the Selenium BS. Any ideas? It should be able to take screenshots, click the mouse, and use the Brave browser. I got it doing a lot of this stuff, but it still needs work. Can you make a video about doing this for the RPi 5 using the latest Brave browser on Raspbian OS (Debian)?
@dontbelasagna5968 2 years ago
My CSV keeps separating the string by characters. For example, the word "the" ends up in the CSV as t in one cell, h in the cell next to it, and e in the next one as well. How do I fix this?
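A likely cause of the symptom described above is passing a bare string to `csv.writer.writerow()`, which treats its argument as a sequence of cells and so iterates a string character by character. This is a guess at the commenter's code, not something shown in the thread; the sketch below reproduces the behavior in memory.

```python
import csv
import io

buffer = io.StringIO()  # in-memory stand-in for the CSV file
writer = csv.writer(buffer)

quote = "the"
writer.writerow(quote)    # a bare string is split into one cell per character
writer.writerow([quote])  # wrapping it in a list keeps it in a single cell

lines = buffer.getvalue().splitlines()
print(lines)  # ['t,h,e', 'the']
```

Wrapping each value in a list (or building a list of fields per row) is usually the fix.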
@OffGridAussiePrepper 2 years ago
hahahahaha ur the pun king today :)~
@AliAli-rj9qb 2 years ago
sorry i was missing an s in find_elements so now it is working
@AliAli-rj9qb 2 years ago
If I use bs4 it works fine, but with Selenium I get TypeError: zip argument #1 must support iteration. The program is exactly the same as yours, so why do I get this error?
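The error in this exchange comes from the difference between a single element and a list of elements: `zip()` needs iterables, and the commenter reports that switching to `find_elements` fixed it. The sketch below reproduces the same failure without a browser, using a hypothetical `FakeElement` class as a stand-in for a Selenium element; `pair_up` is also an illustrative name.

```python
class FakeElement:
    """Hypothetical stand-in for a Selenium element: has .text, is not iterable."""

    def __init__(self, text):
        self.text = text


def pair_up(quotes, authors):
    """Zip two collections of elements into (quote_text, author_text) tuples."""
    return [(q.text, a.text) for q, a in zip(quotes, authors)]


# A list of elements (what the plural find_elements returns) works with zip().
quotes = [FakeElement("Quote one"), FakeElement("Quote two")]
authors = [FakeElement("Author one"), FakeElement("Author two")]
pairs = pair_up(quotes, authors)

# A single element (what the singular find_element returns) is not iterable,
# so zip() raises TypeError.
try:
    pair_up(quotes[0], authors)
    error = None
except TypeError as exc:
    error = exc

print(pairs)
print(type(error).__name__)
```

The exact wording of the `TypeError` message differs between Python versions, but the cause is the same.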
@mefaun 2 years ago
Yay now I can be Thomas Anderson in the Matrix
@100996julen 2 years ago
I'm planning to write a Twitter scraper program with Python. Which Raspberry Pi model is better for it? I want to buy the cheapest one I can. Thanks!
@randomhominid9816 2 years ago
Why not just use your desktop or laptop computer? A raspberry pi isn't needed but if you want one the rpi 4 with 2GB will probably be enough but maybe get the rpi 4 with 4GB to make sure you have enough memory as browsers tend to use a lot of memory.
@arjix8738 2 years ago
It's much better to sign up for the twitter API
@spumeeuw430 2 years ago
I am running into the following issue when trying to install the chromedriver: "E: Unable to locate package chromium-webdriver". Has anybody run into this issue before?
@Sokar599 2 years ago
How about Puppeteer, isn't that the standard nowadays? Good tutorial as always.
@Tinkernut 2 years ago
I thought puppeteer was developed for node.js. Is there a python branch too? Selenium is the OG, that's why I went with it.
@Sokar599 2 years ago
@@Tinkernut Ah yes indeed, I don't often use python I guess. Good to see you're still uploading videos! I used to watch you as a kid all the time. Thanks for educating :)
@serhiyranush4420 2 years ago
I am running this script on Windows 7 machine and it works beautifully. However, when running from Thonny, no password prompt appears in the Thonny's console. However, when launching it from the command line window, the password prompt does appear. How can it be fixed for the password prompt to appear in Thonny?
@jyvben1520 2 years ago
in the console window or did you expect a gui popup window
@serhiyranush4420 2 years ago
@@jyvben1520 No, I didn't expect a GUI popup window. But I did expect a console prompt, as at 6:42 in this clip.
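One possible workaround for the missing prompt discussed in this thread, assuming the script asks for the password with `getpass` and that the IDE shell does not behave like a real terminal (neither is confirmed for Thonny here): hide the input only when a terminal is attached, and fall back to a plain `input()` otherwise. The helper name `prompt_password` is hypothetical.

```python
import getpass
import sys


def prompt_password(prompt="Password: "):
    """Ask for a password, hiding the typed text only when a real terminal is attached."""
    if sys.stdin is not None and sys.stdin.isatty():
        return getpass.getpass(prompt)
    # Assumption: in an IDE shell that is not a terminal, hidden input may not work,
    # so the text is read visibly instead. The password will be shown on screen.
    return input(prompt)
```

The trade-off is that the fallback echoes the password, so it only suits a private screen.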
@Illvidri 2 years ago
I see the next button and I think "He's just scraping the surface"
@CrjaseMechaEngr 2 years ago
requests could have done this
@Pod-Z 2 years ago
Holy shit you listened to my comment
@thekevalpanchal 2 years ago
Hello
@sarthoknextt5150 2 years ago
Have you worked as a QA in the past?
@myriadtechrepair1191 2 years ago
You can scrape my web anytime, pun man.
@dunste123 2 years ago
Not enough dad jokes :P
@4crafters597 2 years ago
Does anyone have a solution for sending the password without including it in the code?
@userz111 2 years ago
A separate config file, or save and load browser profiles.
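Besides the options named in this thread, another common way to keep a password out of the source is an environment variable, with an interactive prompt as a fallback. The variable name `SCRAPER_PASSWORD` and the helper `get_password` are assumptions made up for this sketch.

```python
import getpass
import os


def get_password(env_var="SCRAPER_PASSWORD"):
    """Return the password from an environment variable, prompting only if it is unset."""
    password = os.environ.get(env_var)
    if password:
        return password
    # Nothing set in the environment: ask interactively without echoing the input.
    return getpass.getpass("Password: ")
```

On a Raspberry Pi the variable could be set in the shell before running the script, so the script itself never contains the secret.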
@otmw6726 3 months ago
thanks for not explaining how you found the identifier for the log in button
@mohmedbadr1947 2 years ago
You are late to the party, my friend. Most of the websites we want to automate or scrape have some anti-bot protection.
@Tinkernut 2 years ago
I can see how that may be true for you, but not in general. Most popular websites (twitter, wikipedia, imdb, amazon, youtube, etc) have no such measures. It depends on the website and what they allow. If they have antibot precautions in place, then it's probably not legal to scrape that site anyway. I'm trying to avoid legal issues with this video.
@nibblrrr7124 2 years ago
@@Tinkernut IANAL, but in the US, merely violating some corporate website's terms of service is not illegal in itself. See e.g. the EFF's reporting on Oracle v. Rimini 2018, which actually involved scraping ("Ninth Circuit Doubles Down: Violating a Website’s Terms of Service Is Not a Crime"). Naturally, I completely understand that you'd want to steer clear of legal issues on your channel, though. (Thanks & keep up the great work, BTW!)
@mfawzi89 2 years ago
Can I use this code to hack the username and password 😌
@woodenbeast9337 2 years ago
What do you gain by scraping data? Is this useful?
@yetzt 2 years ago
data journalist here. yes, scraping is useful if the data you need is not provided any other way. and often times it is not.
@TheOnlyRaichuu 2 years ago
I'm a freelance web scraper. There are so many clients. So yes, this is useful. Data is knowledge you can turn into profit. Think about big data companies like Google, for example.
@woodenbeast9337 2 years ago
@@TheOnlyRaichuu It just teaches how to strip our privacy and profit off selling very sensitive data. Running a for-profit hack.
@TheOnlyRaichuu 2 years ago
@@woodenbeast9337 Why are you asking when you already made up your mind beforehand? What you're saying is absolutely wrong and ridiculous. How does it hurt your privacy when a car dealership wants to get all the data of car listings, with their details and price tags, to optimize its own pricing? Is anyone's privacy affected now? No.
@woodenbeast9337 2 years ago
@@TheOnlyRaichuu weak comparison
@yetzt 2 years ago
whats up with your sound? it sounds like its out of sync with itself. also i'd recommend going with puppeteer and node if one was more comfortable with js. it just integrates better.
@dudds6699 2 years ago
Web Scraping with Selenium I know it can be done but its the wrong tool for the wrong job.
@SeaJay_Oceans 2 years ago
That is very Edgey comedy...
@gmog7857 2 years ago
Who do you think you are talking to? python experts?
@nibblrrr7124 2 years ago
Curious people with access to a search engine, motivated to build something they want? :^) If you tell me what it is you'd like to do, what you tried, and where you got stuck or have questions, maybe I can help you or point you in the right direction.
@drewmillett2089 28 days ago
@@nibblrrr7124 Hey I would enjoy some help if you still read these comments. I think I'm getting stuck on pointing Selenium to the correct browser driver path. If I right click on Chrome it shows a path of the executable file but I'm getting webdriver errors when I use this line of code: browser_driver = Service('C:\Program Files (x86)\Google\Chrome\Application\chrome.exe') . I didn't really see how tinkernut came up with the path...
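A note on the path in the question above: Selenium's `Service` expects the path to the ChromeDriver executable, which is a separate program from the Chrome browser (`chrome.exe`). The sketch below, assuming Selenium 4, adds a small name check before starting the browser. The helper names and the `C:\tools\chromedriver.exe` location are hypothetical, and raw strings (`r"..."`) are used so that backslashes in Windows paths are taken literally.

```python
from pathlib import PureWindowsPath


def looks_like_chromedriver(path):
    """Return True if the file name suggests ChromeDriver rather than the browser itself."""
    return PureWindowsPath(path).stem.lower() == "chromedriver"


def build_driver(driver_path):
    """Start Chrome through Selenium 4 using an explicit ChromeDriver path."""
    if not looks_like_chromedriver(driver_path):
        raise ValueError(f"{driver_path!r} does not look like a chromedriver executable")

    # Imported here so the path check above can be tried without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    return webdriver.Chrome(service=Service(driver_path))


# The path from the comment points at the browser, not the driver.
browser_path = r"C:\Program Files (x86)\Google\Chrome\Application\chrome.exe"
# Hypothetical location of a separately downloaded ChromeDriver.
driver_path = r"C:\tools\chromedriver.exe"

print(looks_like_chromedriver(browser_path))
print(looks_like_chromedriver(driver_path))
```

Recent Selenium releases can also locate a matching driver on their own when no path is given, which may avoid the manual path altogether.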
@astemet 2 years ago
i got discord bot ready
@Dikkedimi 2 years ago
dude, your audio is real bad. all over the place.