How to Scrape Websites with GPT-3.5 (Web Scraping with ChatGPT) 

The PyCoach
39K subscribers
292K views

Published: Sep 28, 2024

Comments: 164
@ThePyCoach 1 year ago
If you'd like to know how I'd use ChatGPT to learn to code, check out my new video 👉ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-Eh_KovOmQRQ.html
@AEON. 1 year ago
Can you make a video of getting ChatGPT to Google Dork or to write a code to do so for any website?
@lhcsouth 1 year ago
This is awesome, thank you! I have been trying to use this video to help me scrape certain aspects of flippa and apparently the commands I'm giving ChatGPT aren't correct. lol I want to scrape flippa every 5 minutes after the hour and I want it to scrape the site for site name, type, industry, monetization, age, net profit, and the site description. Any help would be highly appreciated. Thank you!
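A minimal sketch of the scheduling part of this question. It assumes "every 5 minutes after the hour" means one run per hour at five past (xx:05); the function name `next_run` is hypothetical, and the scraping itself is left out.

```python
from datetime import datetime, timedelta


def next_run(now: datetime, minute: int = 5) -> datetime:
    """Return the next moment whose minute equals `minute`, strictly after `now`."""
    # Candidate run time inside the current hour.
    candidate = now.replace(minute=minute, second=0, microsecond=0)
    # If that moment is now or already past, move to the next hour.
    if candidate <= now:
        candidate += timedelta(hours=1)
    return candidate
```

A long-running script could sleep for `(next_run(datetime.now()) - datetime.now()).total_seconds()` seconds and then call its scraping function. A system scheduler such as cron would do the same job without keeping a process alive.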
@ollieroddy 1 year ago
Thanks for posting the video. Is there any way to do this for a site that requires a sign in? I have a sign in for a site and want to use a scraping tool to scrape all of the data from it. The data is behind a click, i.e. you have to click the listing in the contacts database to see the email address. Any idea how to do this?
@oxtharhino9784 1 year ago
What website did you paste that into to get the test to print?
@mrtn5882 1 year ago
So basically you tell ChatGPT all the things that you'd tell any other web scraping tool as well. How genius! ChatGPT is now as intelligent as Scrapinghub was 10 years ago. 😅
@thesteffan5766 1 year ago
Which free tool would you recommend?
@maskman4821 1 year ago
Wow, this is really amazing. Thank you for showing us how to tell ChatGPT to scrape the web page to get the content we need. This is a valuable lesson, thank you so much 🙏
@ThePyCoach 1 year ago
Due to the many questions and comments about ChatGPT and OpenAI Playground, here are some notes.
1. Yes, ChatGPT and OpenAI Playground are not exactly the same. That said, if you use the prompts I created in this video on ChatGPT, it'd generate code that scrapes the website just like with Playground.
2. Why did I use Playground instead of ChatGPT? Speed. I only wanted to get the code and not the bla bla bla you get with ChatGPT. I like the explanation ChatGPT gives, but when recording the video it's a bit annoying.
3. Some people say Playground isn't free. At least, it was free for me at the moment I recorded the video. I never gave credit card information and it's still working fine for me today.
4. ChatGPT/Playground currently generates code for Selenium 3 and not for the latest version (Selenium 4), so keep that in mind when using its generated code.
@ESSARGEE 1 year ago
this is insane! TY for the videos, chat gpt3 is basically the next internet lol
@abhijeetdhumal7385 1 year ago
Because it doesn't know about the world after 2021
@maskman4821 1 year ago
@@abhijeetdhumal7385 yeah, I wonder why openAI team does this, if AI can't tell us the latest info, then it is no better than Google, you know 🤔
@sheetai 1 year ago
The reason is that it was trained on data from the internet, and there is more data about the older version than the new one. Have you tried asking it specifically to write for Selenium 4?
@Privacyking 1 year ago
@@maskman4821 so that u will pay subscription when they launch the latest dataset :P
@navin346 1 year ago
The projects shown in this video used to be enough to pass a whole examination or certification
@pavel9652 1 year ago
Twitter loads only so many tweets at once, so when you scroll too far, it will remove the objects from HTML on the top. I learned it when I was searching for something with ctrl+f in the browser.
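A minimal sketch of how a scraper can cope with the behaviour described here, where earlier items disappear from the HTML as you scroll. It assumes each scroll step produces a list of the item texts (or IDs) visible at that moment; `collect_across_scrolls` is a hypothetical helper, and the browser automation that produces the snapshots is not shown.

```python
def collect_across_scrolls(snapshots):
    """Merge per-scroll snapshots into one ordered list without duplicates."""
    seen = set()
    collected = []
    for visible_items in snapshots:
        for item in visible_items:
            # Items overlap between consecutive scroll positions, so keep
            # only the ones that have not been recorded yet.
            if item not in seen:
                seen.add(item)
                collected.append(item)
    return collected
```

The point of the design is to record items after every scroll step rather than once at the end, because by then the earlier elements are no longer in the page.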
@mamneo2 1 year ago
Incroyable.
@NiruBhat1 1 year ago
@@mamneo2 Excroblagaxyable
@disrael2101 1 year ago
Best video of 2023 so far
@annapetmikel4356 1 year ago
you are SO good. Learn much from you
@joshsowin 1 year ago
You’re using the “text-davinci” GPT model, but “code-davinci” will probably work better for these kinds of things and will work more consistently
@Aru8675 1 year ago
Hey brother, can you tell me why I am getting only [ ] in the output whenever I try to scrape data from some websites? Does this happen because they use JavaScript? Or because JavaScript is not supported by Beautiful Soup? So is Selenium best for all types of JavaScript, CSS and HTML websites? Also, please make a video or provide your email; I am facing issues installing the Chrome webdriver.
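A minimal sketch of one common cause of an empty `[]` result: the downloaded HTML contains only an empty container, and the items are added later by JavaScript in the browser. Python's standard `html.parser` stands in for Beautiful Soup here, and both sample pages and the `loadProducts` script call are made up for illustration.

```python
from html.parser import HTMLParser


class ListItemCollector(HTMLParser):
    """Collect the text of every <li> element in a document."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._in_li = False

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self._in_li = True
            self.items.append("")

    def handle_endtag(self, tag):
        if tag == "li":
            self._in_li = False

    def handle_data(self, data):
        # Only text inside an <li> is kept; script source is ignored.
        if self._in_li:
            self.items[-1] += data


def list_items(html):
    parser = ListItemCollector()
    parser.feed(html)
    return [item.strip() for item in parser.items]


# Items are present in the HTML that the server sends.
STATIC_PAGE = "<ul id='products'><li>Book A</li><li>Book B</li></ul>"
# Items would be inserted by a script, which a plain HTML parser never runs.
JS_PAGE = "<ul id='products'></ul><script>loadProducts('#products')</script>"
```

Here `list_items(STATIC_PAGE)` finds both entries while `list_items(JS_PAGE)` returns an empty list. A browser-driving tool such as Selenium executes the script first, which is why it can see content that a parser working on the raw download cannot.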
@txreal2 1 year ago
Got an error at OpenAI Playground: "You've reached your usage limit. See your usage dashboard and billing settings for more details." You've got to pay to play :(
@FaycalAZIB 1 year ago
I'm stuck trying to scrape data for this query on Google Maps: "Hotels near [city name]", and to get the price of each hotel in the detail view
@Mesut.e 1 year ago
It's a very simple and understandable educational video, thank you... I have a few questions. Is it possible to scan current ASINs (product identification numbers used by Amazon)? Of course, I need to enter location information to fetch data from the Amazon website. My other question is, how can I select the current profiles in my Chrome browser while using chromedriver?
@kottinaresh89 1 year ago
But chatgpt says it cannot connect to internet, no real time access as its an ai model. Is this something other than chatgpt?
@DJPapzin 1 year ago
Thank you for this great tutorial. I tried to scrape Twitter by scrolling, but I always get the same results
@MangalFaisal 1 year ago
what if I want to scrape the movie title, but also, the info inside each movie/url that leads to the info of the movie?
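A minimal sketch of the two-step pattern this question describes: read the titles and links from a list page, then fetch each linked detail page. The `fetch` argument is any function that returns HTML for a URL, so the sketch makes no network calls itself. `LinkCollector` and `scrape_with_details` are hypothetical names, and a real page would need a narrower selection than "every link".

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect (link text, href) pairs for every <a> tag that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def scrape_with_details(list_url, fetch):
    """Return one dict per linked item: its title, absolute URL and detail HTML."""
    parser = LinkCollector()
    parser.feed(fetch(list_url))
    results = []
    for title, href in parser.links:
        # Relative links are resolved against the list page's URL.
        detail_url = urljoin(list_url, href)
        results.append(
            {"title": title, "url": detail_url, "detail_html": fetch(detail_url)}
        )
    return results
```

Passing `fetch` in as a parameter keeps the crawling logic separate from how pages are downloaded, so the same function works with a plain HTTP client or with page source taken from a browser session.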
@DVSHORTS21 9 months ago
Hello bro how to scrape a running website which will change all the time
@shameemahamed1031 1 year ago
Everyone says the name as Amazon at the start but the tutorial is about a different website scraping. The real part is you can not scrape Amazon content easily and if you go by this same method it'll fail as the DOM element won't be available and no one is telling you this.
@JesusChinoMagris 1 year ago
Great tutorial. Thank you. Question: I am trying to create a Chrome extension to scrape products in every category that have no videos. Will this be possible using this same approach? TIA
@learnwitharbia3477 1 year ago
thank you so much, that's really helpful!
@TheTalidi 1 year ago
Thanks. Please, is there any method to find out the keywords used on some products on Amazon with ChatGPT?
@El.Desarrollador 1 year ago
I love you so much I really love you months and months trying to figure it out finally I was able to complete a project
@avamaria8447 1 year ago
what project have you been working on months and months?
@aiexplains 1 year ago
You can tell ChatGPT to scrape, for example, the item names, and just leave the selector empty; then you can fill in the selectors yourself
@subhamsaha2235 1 year ago
Thankssssssssssssssssssssssssssssssssssssssssssssssssssssss, for the first time I scraped a website by writing the exact code and also understood it completely, what was happening inside it
@futuregootecks 1 year ago
Bro this is genius!!! Thank you
@Comic_Book_Creator 1 year ago
I need to scrape video links from a website. Can I do that?
@smanzoli 1 year ago
Playground is another AI, a much less capable one (davinci)… NOT ChatGPT.
@simpsonsmagazine 1 year ago
Davinci is actually a bigger model that’s more capable but you need to give it more instruction
@bezillions 1 year ago
ChatGPT is GPT-3.5 fine-tuned to have discourse; "Playground" is all of their models, which are all based off GPT-3.5 at the moment and fine-tuned to do specific things, like Codex writes code, davinci does embedding and generation of text
@MohammedAli-tq8ln 1 year ago
Yes he keeps saying ChatGPT.. anyway his tutorial is very useful 🌹
@xWe2s 1 year ago
No, it's not. The text davinci 003 is the latest model, which actually is the so called gpt3.5 and ChatGPT. 🤦‍♂️
@smanzoli 1 year ago
@@xWe2s Chat GPT is NOT the same da vinci you get in the Playground by ANY means.
@filmydhamaka4401 1 year ago
After getting the script, where did you paste it to run?
@AEON. 1 year ago
Can you make a video of getting ChatGPT to Google Dork or to write a code to do so for any website??
@Oliverqueen 1 year ago
will this work on gambling websites that provide live stats? could you code it to constantly check the website for updated info? or would you need auth api for that.
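A minimal sketch of the "constantly check for updated info" idea: fetch the page repeatedly and compare a hash of the content with the previous one. The `fetch` argument is any zero-argument function returning the current page text. `detect_changes` is a hypothetical helper; a real loop would pause between rounds, and whether a given site permits this or offers an official API is a separate question.

```python
import hashlib


def fingerprint(content: str) -> str:
    """Short, comparable summary of a page's content."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def detect_changes(fetch, rounds):
    """Call fetch() `rounds` times; return the round indices where content changed."""
    last = None
    changed_at = []
    for i in range(rounds):
        current = fingerprint(fetch())
        # The first round only establishes the baseline.
        if last is not None and current != last:
            changed_at.append(i)
        last = current
    return changed_at
```

Hashing the whole page reports any change, including ads or timestamps, so in practice it helps to hash only the extracted values you care about.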
@baghi536 1 year ago
Thank you!
@mnali 1 year ago
Thank you for the tips
@Mshahzaibkundi 1 year ago
Can you do that for product ASIN and product price, like product finding for Amazon FBA?
@imhugeinjapan7 1 year ago
What can you do once you scrape a list? Like, can you show more possibilities?
@NiruBhat1 1 year ago
you can save it!!!!!! & delete, recyclin bin - restore and again again, infinite possibilities instead of reading directly from the web!
@iammcqwory 1 year ago
Thank you very much
@kevinkurnia2259 1 year ago
thanks bro i need this
@nichandesign 1 year ago
Can you make ChatGPT learn something about current data and updates by giving it information?
@kamel3d 1 year ago
Great video! Could you do an example that scrapes Facebook pages and gets posts? I was trying this the other day and it seems there's no way to scrape public data from FB
@ThePyCoach 1 year ago
I’ve never tried to scrape FB. What’s the issue? There are websites like LinkedIn, though, that have strong anti-scraper systems that make web scraping very challenging. As I mentioned in another comment, this AI tool gives you the code; then you have to come up with something like proxy rotation to deal with anti-scraping systems
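A minimal sketch of the proxy rotation mentioned here, using a simple round-robin over a list of proxies. The proxy addresses below are placeholders, not real servers, and `assign_proxies` is a hypothetical helper; sending the requests through each proxy is left to whichever HTTP client is in use.

```python
from itertools import cycle

# Placeholder proxy addresses for illustration only.
PROXIES = [
    "http://proxy-a.example:8080",
    "http://proxy-b.example:8080",
    "http://proxy-c.example:8080",
]


def assign_proxies(urls, proxies):
    """Pair each URL with the next proxy in round-robin order."""
    rotation = cycle(proxies)
    return [(url, next(rotation)) for url in urls]
```

Each `(url, proxy)` pair can then be handed to the client's proxy setting, for example `urllib.request.ProxyHandler` in the standard library. Rotation spreads requests across addresses; on its own it does not make scraping a site permitted.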
@soccergalsara 1 year ago
bit weird
@aleksd286 1 year ago
I think I would be quicker with Copilot
@xWingzTV 1 year ago
What is copilot?
@aleksd286 1 year ago
@@xWingzTV Microsoft's AI tool specifically for programming. Been using it since the beginning
@chuckrussell3804 1 year ago
What if I need to Log in first?
@DerClaudius 1 year ago
"loophole" ....
@ghayethbouraoui50 1 year ago
what about scraping with facebook????
@willowwood6798 1 year ago
:) now I have to Google scrape website so I can know what he is doing…Context gets me 60% there.
@Wolfoffreedom111 1 year ago
Dude, you can scrape with BeautifulSoup. Why are you saying you can’t scrape Amazon with it lol
@th-xd9xv 1 year ago
Great video! Is this also useful for Instagram? They have a very strong bot detector.
@ThePyCoach 1 year ago
The tool simply gives you the code. The rest is up to you. In this particular case, you’d probably need to rotate proxies
@th-xd9xv 1 year ago
@@ThePyCoach yeah thought so. Thanks for clarifying. Love your content. Keep it going!
@schmetterling4477 1 year ago
That's cool... let's enable blatant copyright violations with AI. ;-)
@NiruBhat1 1 year ago
Change heading as "How to scrape any website content using Python"
@akramxgamer6574 1 year ago
what the idea
@arnabchatterjee1611 1 year ago
Script?
@ThinkwithLex 1 year ago
I think typing code is much faster
@mrbinix3573 1 year ago
Cant scrape twitter no more
@iamisobe 1 year ago
Amazon Ip bans for this
@mohamedde2583 1 year ago
Chromedriver ain’t working no more?
@howarja 1 year ago
You need to have the Chromedriver version that matches your Chrome browser release version. So, as your Chrome browser updates, you'll need to update your Chromedriver.
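A minimal sketch of the version check described here, assuming that "matches" means the same major version (the number before the first dot). The helper names are hypothetical, and how the two version strings are obtained is not shown.

```python
def major_version(version: str) -> int:
    """Return the leading number of a dotted version string."""
    return int(version.split(".")[0])


def versions_compatible(chrome_version: str, driver_version: str) -> bool:
    """True when the browser and the driver share the same major version."""
    return major_version(chrome_version) == major_version(driver_version)
```

Running a check like this before starting the browser turns a confusing startup failure into a clear message that the driver needs updating.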
@johnpower1458 1 year ago
Use Mozilla driver
@1996Pinocchio 1 year ago
Just tell ChatGPT what error you are getting, and it will tell you what you could do.
@MyBakedTurtle 1 year ago
I’m tryna make a program where I can give a python script a bunch of top 20-50 ad agency website list and then it gets each of those websites and then scrapes for all their email so I can contact them offering my video editing services
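A minimal sketch of the extraction step in this idea: pulling email addresses out of HTML that has already been downloaded. The regular expression is a deliberately simple approximation rather than a full address validator, `extract_emails` is a hypothetical helper, and fetching each site from the list is left out.

```python
import re

# Simplified pattern: local part, "@", domain labels, and a final alphabetic label.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(html):
    """Return unique addresses, lower-cased, in order of first appearance."""
    found = []
    for match in EMAIL_RE.findall(html):
        email = match.lower()
        if email not in found:
            found.append(email)
    return found
```

Addresses that are written out in obfuscated form or rendered by JavaScript will not be found this way.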
@israelp 1 year ago
Up
@vishalharihar238 1 year ago
How can we ask ChatGPT to do pagination and scrape data for multiple pages within the same website?
@ThePyCoach 1 year ago
The pages are usually in a navigation bar (<nav> tag). Just tell ChatGPT to locate an element that has the nav tag and add the class name of the element.
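A minimal sketch of the pagination approach described here: collect the links that sit inside the navigation element and turn them into absolute page URLs. It uses the standard `html.parser` module, matches any `nav` element rather than one with a specific class name, and the names `NavLinkCollector` and `page_urls` are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class NavLinkCollector(HTMLParser):
    """Collect href values of <a> tags that appear inside a <nav> element."""

    def __init__(self):
        super().__init__()
        self.hrefs = []
        self._nav_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "nav":
            self._nav_depth += 1
        elif tag == "a" and self._nav_depth > 0:
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)

    def handle_endtag(self, tag):
        if tag == "nav" and self._nav_depth > 0:
            self._nav_depth -= 1


def page_urls(base_url, html):
    """Return absolute, de-duplicated page URLs found in the navigation bar."""
    parser = NavLinkCollector()
    parser.feed(html)
    urls = []
    for href in parser.hrefs:
        url = urljoin(base_url, href)
        if url not in urls:
            urls.append(url)
    return urls
```

The scraper can then loop over the returned URLs and run the same extraction on each page. Sites that only expose a "next" link need the loop to re-read the navigation bar on every page instead.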
@pavel9652 1 year ago
The ai will replace programmers bro, they say. Just ask on chat overflow how to write English question ;) I didn't intend to be mean, just found it funny ;)
@Business099 1 year ago
Bro scrap telegram
@proxyscrape 1 year ago
@oxylabs we better watch out. ChatGPT is going to put everyone out of business. 😁
@HQ-OnlyFans-Traffic 1 year ago
Can this work on a website that requires log in to access any data I want to scrape? Thanks!
@Sammyli99 1 year ago
whats more amazing is guess how much the Ai has grown in the last 4 weeks....now it knows what every hustler and hacker wants...feeds it back to: Google, Facebook, Insta, and the FBI and hey presto, secured loop. All for a fee sure, MS servers don't run themselves.
@magicsmoke0 1 year ago
It loses a bit of usefulness when you have to go look at the source and figure out for the ai where and how to scrape. Is there no ai that can "search" the html structure for the content you want?
@psocretes8183 1 year ago
I have never understood how bots work. Like many people I have dabbled with code but haven't needed to go further. I didn't realise that the classes and IDs etc. were to help others access your site data. It all makes so much more sense. Thanks.👍
@davidinark 1 year ago
So how do you tell it to scrape the subsequent pages?
@TheWorldFamousBeaverpedia 1 year ago
I just asked the same thing two months after your question. I guess he either missed your question or I just won't get an answer. Your question is all I really care about :)
@seanzhang3873 1 year ago
Good video. I think this is going to be the future of coding: humans won't need to do low-level coding, just give high-level instructions.
@KCM25NJL 1 year ago
That low-level code you speak of, is high-level code. GPT just adds a higher level :)
@seanzhang3873 1 year ago
@@KCM25NJL The low-level code I mention means the actual code, like Python, C or C++. The higher-level code I mention means the natural language which explains the fundamental logic of the program. But I do get what you mean.
@pavel9652 1 year ago
This is a high-level code. Languages such as Python or Ruby are amongst the least verbose. Create a code snippet once and use it later. What is the point writing essays in English? ;) Great for learning, though.
@k-c 1 year ago
Pretty useful and time saving. Thanks again.
@lokesh9322 1 year ago
Just 1 or 2 mins. It won't take more time to Google and copy from Stack Overflow.
@Mr.nice. 1 year ago
Why bother using GPT or anything? If I know how it works, I could just write the code myself, and it will be more fun than talking to a robot
@mrthompson5084 1 year ago
Won’t scrape LinkedIn correctly. Fuck I hate LinkedIn.
@jareda8943 1 year ago
This is exactly what I needed. Thank you!
@nfaza80 1 year ago
text davinci is not optimized for coding. use other model
@pleabargain 1 year ago
Which model do you recommend?
@seanzhang3873 1 year ago
@@pleabargain codex?
@ThePyCoach 1 year ago
I didn't know that. Which one do you use for coding? 🤔
@deep2mixer 1 year ago
@@ThePyCoach OpenAI Codex
@greengoblin9567 1 year ago
@@deep2mixer chat gpt is actually better at coding than codex.
@hariniavula5748 1 year ago
what if we have pagination or if the data is in api?
@magatsu82 1 year ago
isn't playground a paid service?
@Meleeman011 1 year ago
wtf it can do this too? XD
@kimbapslayer1995 1 year ago
Lol chat gpt is not connected to the internet 😂
@happyfreeky 1 year ago
Isn't what you're using (playground) running GPT-3, not chatGPT? You said yourself that chatGPT doesn't have Web access (which is correct). But GPT-3 does. You keep saying chatGPT in your video (and title), which doesn't make sense. Or is playground a special version of chatGPT (aka GPT-3.5) that has Web access?
@__d.y 1 year ago
You’re just pseudo coding at that point. You’ll need to know how the code will look to ask the question in the first place. I don’t really see a point in doing this. What would be useful is dropping in html code and giving examples of what you want scraped
@knoopx 1 year ago
sorry but it's actually faster to write a scraper from scratch... try pasting the whole html and asking it to write a scraper by example instead...
@plasmaawakened4785 1 year ago
How would I scrape a website for the ad copy on their home page? 🤔
@sayednab 1 year ago
how to integrate chatgpt to email, excel and messages?
@jason_v12345 1 year ago
I see zero point in using Chat GPT as merely a direct, 1-to-1 translator from natural language to code. If I need to write the instructions at the same level of abstraction as the code, I might as well just write the code. That's largely why programming languages exist! They are specialized languages that, unlike natural languages, can precisely and succinctly express low-level technical requirements in a human readable form.
@kami2496 1 year ago
It's to save time. Personally, I haven't touched HTML in years; working in electronics, I hadn't needed it until now. I could retake the courses and get up to date, but that takes additional time. This shortens that time. And how! It also refreshes your memory of the various languages.
@satepestage3599 1 year ago
I guess it's just more about batching those instructions.
@mamneo2 1 year ago
@@kami2496 Hi Miguel, I see you've been replying to several comments in English with answers in Spanish 😂
@kami2496 1 year ago
@@mamneo2 Well, I assume that if there's any problem, the translator solves it. It's not necessarily a barrier.
@spicer41282 1 year ago
Much like the other comment by a Lance below? This is NOT ChatGPT! You're misleading your Viewers. This is GPT-3 using the Davinci Text 3 option. Maybe you're going for more Views by saying "ChatGPT" ? But you are going to Confuse those that are Not too familiar with this technology. However, having said All of the above? Just wanted to make sure everyone knows. All in all.... This is an Excellent tutorial !!
@ThePyCoach 1 year ago
I shouldn't have used the terms ChatGPT/Playground interchangeably. That said, the prompts I used generated the same code in both ChatGPT/Playground. I just decided to use the latter because it generated the code way faster than ChatGPT without wasting time explaining the code.
@Ozla102 1 year ago
simple version of Github copilot
@onurerdogan6725 1 year ago
How can I scrape all pages of a specific website?
@Nabilh17 1 year ago
thank you for the video, it was interesting use cases 😊
@ttff-bd2yf 1 year ago
I can't get it to scrape Amazon.
@13579Josiah 1 year ago
Waittt is the ai actually going to that link and looking at the html of the page? No right? Then how is it knowing from the url how to scrape the website?
@ThePyCoach 1 year ago
In the case of books.to.scrape, it simply takes a script already built from the internet. In the case of Amazon/Twitter, we're simply giving instructions on how to build the scraper. The tool never gets to see the HTML (at least not in this tutorial)
@FRANKWHITE1996 1 year ago
Subscribed
@alexandermartens192 1 year ago
I don't see a use case for it in the real world. When you first need to dive into the source code to find the right divs or CSS, what is the point of using AI? The code itself can be written in 5 mins manually and probably in 1 min with GitHub Copilot.
@joshsowin 1 year ago
This is not ChatGPT, why do you keep saying that 😢
@ThePyCoach 1 year ago
Yeah, I shouldn't have used the terms interchangeably. That said, you'd generate the same code with both ChatGPT/Playground. I know because I tried both. I've just made the video with Playground because it was faster.
@s.baskaravishnu22 1 year ago
Many thanks
@bangaruvarun3750 1 year ago
Jobs are at stake!!!!
@CaterpillarOGM 1 year ago
This is no loophole
@adamgdev 1 year ago
Solid video!
@MagicNoThief 1 year ago
Warning!! The OpenAI Playground ISN'T FREE.
@ThePyCoach 1 year ago
I think they give you free credit each month. I never put any credit card info and I've used it for free so far
@MagicNoThief 1 year ago
You only get some credits when you register, for a month
@MagicNoThief 1 year ago
Also "Playground chatGPT" is not chatGPT. It's another language model, that's called GPT-3. ChatGPT is a GPT-3 based model that can communicate in conversations.
@whitechasp 1 year ago
what kind of use does this have?
@kami2496 1 year ago
I had in mind applying a summary generator. Imagine, for any chat, being able to ask for a summary of the last ## of time. Suppose a conversation lasted 30 minutes: instead of reading all of it, you could first analyze it as a summary, extract the keywords, and decide whether it's relevant to review the content or not... That way I skip the time spent reading filler comments: congratulations, greetings, repeated content, out-of-context things... But the first step is to extract the information from a page, then process it with NLP technologies [GPT3], and we get results that save us a lot of time....
@eggmanaegman 1 year ago
Hello, I used exactly the same code but it gave a syntax error. I asked ChatGPT the same question and it gave me different code that worked.