Python Tutorial: Browser Automation & Web Scraping with Selenium - Part 2 

Make Data Useful
32K subscribers
11K views

In part 2 we auto-login with Selenium then use Python BeautifulSoup to scrape the contents of the pages to create a Pandas dataframe at the end.
Part 1 available here - • Mastering Browser Auto...

Published: 9 Sep 2024

Comments: 76
@trinb1 4 years ago
I learnt more in 30mins than what I learned in 4+ hours from studying the same subject matter in a WebScraping book
@MakeDataUseful 4 years ago
Hi, Ben, that's really awesome to hear. I haven't uploaded in a couple of weeks but it's comments like yours that motivate me to get sharing!
@sarcasmasaservice 4 years ago
Thanks for these tutorials, you have an excellent teaching style. I look forward to sharing your videos with my students as supplemental materials (and "scraping" them for potential assignment ideas). Keep up the great work!
@MakeDataUseful 4 years ago
Thanks for the positive feedback Joe, I really appreciate it! I hope your students enjoy the content!
@Zancb 3 years ago
Great video! I love your example use of Python code in each of those sections. Very helpful in visualizing the code being executed and the data being returned. Thank you very much for putting these together!
@toobstr 3 years ago
Just stumbled across your channel and this is some of the best content I have seen related to Python! Your teaching style is fantastic. I hope you keep making these videos. It would be cool if you explored making a video on using Selenium, scraping data, adding that data to a postgres db, making a UI that displays that db data. Oooor another thing I've been wanting to try is taking the data I scraped and adding it to an Airtable sheet for a quick and easy/shareable visualizer. Anyways, keep it up, really enjoying it!
@MakeDataUseful 3 years ago
Thank you! Awesome ideas!
@pursuing.perfection 2 years ago
Keep doing what you are doing this is GREAT
@thewilltejeda 3 years ago
Definitely some of the best scraping tutorials I’ve found for sure ! I’m curious if you have anything planned for crawling with scrapy
@MakeDataUseful 3 years ago
Thanks Will! I'm yet to get using Scrapy but I'm thinking about putting a short series of "learn with me" videos sharing how I approach learning new packages and techniques to help make new topics stick.
@alexkotov2983 3 years ago
Hi! Your videos are great, it helps me a lot. But one thing I still can't completely understand is how to work with network page in DevTools, how to pick an element you need etc. Would be really interesting to get a little bit deeper. Watched all videos so far and still couldnt find explanation.
@muhammadawon8164 3 years ago
Thank You for bringing high-quality education with super easy conceptual techniques for all levels of learners. Just a thing, where someone should ask questions for a specified problem regarding web scraping? Thanks
@SOFRADAKAOS 3 years ago
insane tutorial thanksss! also your beard looks really good
@sandeepyadav1478 4 years ago
thanx for username and password. It still works ;-)
@socompsy 4 years ago
Can you give some pointers on how to not include your login and password directly in your code?
@MakeDataUseful 4 years ago
Hi Jimmy, I would suggest using environment variables. Setting them up is a little different by operating system so you may have to google your OS.
@sandeepyadav1478 4 years ago
@socompsy take it from your system text file, line by line
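A minimal sketch of the environment-variable suggestion in this thread, using only the standard library. The variable names `SCRAPER_USERNAME` and `SCRAPER_PASSWORD` and the helper `load_credentials` are hypothetical and do not come from the video; substitute whatever names you set on your own system.

```python
import os


def load_credentials(user_var="SCRAPER_USERNAME", pass_var="SCRAPER_PASSWORD"):
    """Read login details from environment variables instead of hardcoding them.

    The default variable names are illustrative assumptions, not part of the video.
    """
    username = os.environ.get(user_var)
    password = os.environ.get(pass_var)
    # Fail early with a clear message if either value is missing or empty.
    if not username or not password:
        raise RuntimeError(f"Set {user_var} and {pass_var} before running the scraper.")
    return username, password
```

Set the two variables in your shell before starting Python or Jupyter (for example with `export` on macOS/Linux or `setx` on Windows), then call `username, password = load_credentials()` and pass the values to whatever code types into the login form. The credentials then never appear in the notebook or script itself.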
@originalkundukulangara9281 3 years ago
The part where you described calling the function login() and process_products()...Does this work in python IDE because when I use similar code in pycharm by getting all the code from jupyter notebook, it doesn't bring anything when I call login function
@lemontap7915 3 years ago
I like your beard, also your tutorials are awesome I have learned alot going through your videos. Thank you!!
@creapygames5731 1 year ago
we need new tutorial the driver thingy was changed, but thanks alot for you content is really cool :D
@naziherrahel8609 3 years ago
Thank you so much 😊
@DenisAnzoategui 1 year ago
Beautiful
@monicaguantay3480 3 years ago
Awesome!!!!
@tikendraw 3 years ago
You deserve more.
@snawfel1983 3 years ago
Thnks how would you login into an log.asp site ? I would like to scrape data behind ASP login plz help
@ShahzaibAnwaar 3 years ago
Just came across your channel and went through all the videos this week. Top notch content mate. I noticed you haven't uploaded anything new for a while. Hope all is well. Will you resume uploading new content anytime soon?
@MakeDataUseful 3 years ago
Yes! And thank you! Getting my 2021 upload schedule together. Stay tuned!
@netbin 3 years ago
hey 👋 do i need to raise up my hands every time i run login function?
@MakeDataUseful 3 years ago
💯 it's part of the fun! 😃
@elshroomness 3 years ago
Omg! Such an awesome video! Quick question. Selenium has a known issue with drag and drop function. Do you know a work around for it? I've been stuck on this isssue for two months now.
@roberthuff3122 1 year ago
Thank you! Late, but did you ever release your code as a Collab resource? Yes, I am lazy.
@sinkingboat101 11 months ago
nice!
@youtubian855 3 years ago
brilliant video, thanks for sharing
@renancatan 3 years ago
Hi, very useful! If I have java script content to scrape, how do I keep going instead of BS4, can you past something? I mean, in the html = driver.page_source and after bs4.. how do I do that to keep scraping with selenium or another language as helium for JS?
@olvid.o 4 years ago
waiting next video... thanks a lot
@djordjevojimirovic6501 3 years ago
Great video! Which IDE you are using in this video?
@jazer1370 3 years ago
Im logging in to a website but it blocks me whenever im trying to login I think it needs cookie but I dont know how to use it maybe a tutorial?
@MakeDataUseful 3 years ago
Can do, if you're using Selenium it should just work straight away. Some websites have recaptcha to try to stop automated access.
@jazer1370 3 years ago
@MakeDataUseful im trying to login on nike snkrs but the error says cant connect to the server like that. Generic Post 0 like that
@MakeDataUseful 3 years ago
@jazer1370 I'll check it out and get back to you
@jazer1370 3 years ago
Thank you so much 🙏
@sandeepyadav1478 4 years ago
I want to login into gmail, but having some trouble with specific id or class name. can u help?
@MakeDataUseful 4 years ago
Hi sandeepyadav1478, of course I can but first you'll need to provide a little more detail. You mentioned logging into Gmail, is there is a specific task you are looking to do once you have logged in? Also, what code/packages are you working with? Knowing these two things will help me tailor my response to your specific use case.
@sandeepyadav1478 4 years ago
@MakeDataUseful Ahh, i want to find all mails send to by 1 particular person. and i m using python with webmanager, bs4 and selenium same as u r using expect IDE, i m using sublimetext 3. but i can't login in my gmail account coz it didn't took my keys(userdata) .
@MakeDataUseful 4 years ago
@sandeepyadav1478 this task might be better suited to using the Gmail API. I'll whip up a video for you 🤙
@sandeepyadav1478 4 years ago
@MakeDataUseful thanx man
@jonnyboii1005 3 years ago
Hey 👋thanks for the amazing tutorial. I've got a small question. After web scraping how do you place an order on items that meet a certain criteria?
@netbin 3 years ago
is it possible to inport brave into sillynium?
@jakeleo1857 3 years ago
Like as boos 👍
@ahmedelbon2755 4 years ago
Thank you Adam. If it is possible in future uploads to link a .txt file in the description containing the code if it is possible.
@MakeDataUseful 4 years ago
That's a great idea, I'll make a Github for them all
@armannurhidayat7 4 years ago
Thanks sir👍
@MakeDataUseful 4 years ago
Most welcome
@user-vg4kj7mx2z 4 years ago
Thank you dear
@sandeepyadav1478 4 years ago
Hey, i have encountered 2 more problem [not from the video code]. 1. How to take values of text changing div. (like after click it changes value) 2. my jupyter nootbook works different in every cell, like if i wrote imports 1st cell then i have to write in 2nd cell too. so i can run code in different cells like u. Thanxx
@alenjose3903 4 years ago
thats weird, you should definitely re install anaconda or just download the jupyter again
@justinleard7661 3 years ago
Great video. Spent half a day reviewing other videos of similar content and you went through it with the Kiwi accent plus straight forward approach. My challenge is after I login, my website redirects to the 'data' page. When I try to "driver.get('https:\\my.html.page') then it tells me my login is not correct. There is a remember me check box, which I have ticked in the code before trying driver.get. I suspect it is cookies but not sure on next steps. Any guidance would be appreciated mate.
@josuecrespo8386 3 years ago
Hey I started using python for web scraping since I saw your video. But I have reach a mental brain freeze cuz there is a epub book web that I like. And I have been trying to scrape from it with out any success. The web is VK website is you can point me in the right way I would appreciate it
@josuecrespo8386 3 years ago
Can you help or point me in the right way
@MarioLopez-eu8tj 3 years ago
I am the 2991 sub (Y).
@MakeDataUseful 3 years ago
Thank you! 3,000 new video! I promise!!
@snackers65 4 years ago
When I try to run the process_product function, I place in the html like you have done, but it spits back an empty list [ ]....I have been reviewing your code and I don't seem to see where I went wrong.
@alenjose3903 4 years ago
if you still need help, just mail me @ alenjose59@gmail.com. I had the same problem, its a quick fix
@sandeepyadav1478 4 years ago
AHH, bro that site code have been altered little bit. So u have to select parent class then drive through child divs 1 by 1.
@alenjose3903 4 years ago
@sandeepyadav1478 have to send a request to select the list representation, I wasted hours thinking what went wrong. Will never make that mistake again 😂
@alenjose3903 4 years ago
sandeepyadav1478 that is 1 way to do it, i just redirected to the page the YouTuber used by tweaking the url.
@alenjose3903 4 years ago
sandeepyadav1478 i did that cos , i followed his videos and tested the code at the end. When we load the site from scratch it doesnt work.
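The "select the parent class, then step through the child divs" fix described in this thread can be sketched with BeautifulSoup as below. The markup, the class names (`product-list`, `product`, `name`, `price`) and the function `extract_products` are invented for illustration; they are not the tutorial site's real HTML or the video's own function, so inspect the live page for the actual selectors.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the HTML returned by driver.page_source.
SAMPLE_HTML = """
<div class="product-list">
  <div class="product"><span class="name">Widget</span><span class="price">9.99</span></div>
  <div class="product"><span class="name">Gadget</span><span class="price">19.50</span></div>
</div>
"""


def extract_products(html):
    """Find the parent container first, then walk its direct child divs."""
    soup = BeautifulSoup(html, "html.parser")
    parent = soup.find("div", class_="product-list")
    if parent is None:
        # Parent selector did not match: the usual cause of an empty result.
        return []
    rows = []
    # recursive=False limits the search to direct children of the parent.
    for child in parent.find_all("div", class_="product", recursive=False):
        name = child.find("span", class_="name")
        price = child.find("span", class_="price")
        if name is None or price is None:
            continue  # skip cards that are missing a field
        rows.append({
            "name": name.get_text(strip=True),
            "price": float(price.get_text(strip=True)),
        })
    return rows
```

An empty list usually means the parent selector no longer matches the page, so checking `parent is None` first helps tell a layout change apart from a page with no products. The returned list of dicts can be passed straight to `pandas.DataFrame(rows)` to get the dataframe the video ends with.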
@travelselects272 4 years ago
Adam, big fan of yours here!. I've hit a brick wall. Trying to scrap pages over an API call; first page is no problem. I can't find a pages param in the API URL call. Any work around you can suggest?
@MakeDataUseful 4 years ago
Thanks for the feedback! Is the API one that appears in the network tab in chrome? If so, can you navigate to the second page in the browser and see what the API call looks like?
@travelselects272 4 years ago
@MakeDataUseful Thanks Adam. I'm getting syntax error when I insert page variable like so {page} inside the search query url . I think it's my bad...let me sleep and recharge. ps! let me know if they is a way to share my code.
@MakeDataUseful 4 years ago
@travelselects272 maybe double check you have f at the start of your string if you are using {}
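A short sketch of the f-string tip from this thread, assuming the API takes the page number as a query parameter. The base URL and the parameter names `q` and `page` are placeholders; copy the real ones from the request shown in the browser's network tab.

```python
from urllib.parse import quote_plus

# Placeholder endpoint, not a real API.
BASE_URL = "https://example.com/api/search"


def build_page_url(query, page):
    """Build the URL for one page of results.

    The leading f makes Python substitute {query} and {page};
    without it the braces would be sent to the server literally.
    """
    return f"{BASE_URL}?q={quote_plus(query)}&page={page}"


# Generate the URLs for the first three pages of a search.
urls = [build_page_url("blue shoes", page) for page in range(1, 4)]
```

Looping over `range` like this is one common way to walk through pages once the page parameter has been identified; whether the API counts pages from 0 or 1, or uses an offset instead, has to be checked against the real requests.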
@rafnishad3523 4 years ago
give us some ETL video tutorials
@rafnishad3523 4 years ago
ETL pipeline