AWESOME Excel trick to scrape data from web automatically 

Chandoo
Subscribers: 626K
Views: 139K

💥Check out brightdata.grsm.io/chandoo to sign up for BrightData and to automate your data collection.
Thank you BrightData for sponsoring this video 😍
~
Ever wanted to gather some data from the web and use it for analysis? You can use Excel's Power Query to set up and automate web scraping easily. In this tutorial, let's look at how we can combine US state population data with Chocolate Sales data in Excel.
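For readers who want to see the same idea outside Excel, here is a minimal Python sketch (not the Power Query method from the video). It assumes the page has already been downloaded, parses an HTML table with the standard library, and joins it with internal sales data. The sample HTML, the population figures, the `BOXES_SOLD` numbers and the `boxes_per_million` helper are all made up for illustration.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a cell (whitespace between tags) is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Hypothetical stand-in for the downloaded page; the figures are invented.
SAMPLE_HTML = """
<table>
  <tr><th>State</th><th>Population</th></tr>
  <tr><td>Arizona</td><td>7,000,000</td></tr>
  <tr><td>Ohio</td><td>11,000,000</td></tr>
</table>
"""

# Hypothetical internal data: boxes of chocolate sold per state.
BOXES_SOLD = {"Arizona": 14000, "Ohio": 5500}


def boxes_per_million(html, boxes_sold):
    """Join the scraped population table with sales data by state name."""
    parser = TableParser()
    parser.feed(html)
    _header, *body = parser.rows  # first row holds the column names
    result = {}
    for state, population in body:
        people = int(population.replace(",", ""))  # "7,000,000" -> 7000000
        if state in boxes_sold:
            result[state] = boxes_sold[state] / (people / 1_000_000)
    return result


print(boxes_per_million(SAMPLE_HTML, BOXES_SOLD))
```

The clean-up step (stripping thousands separators before converting to a number) mirrors the kind of transformation done in Power Query before loading the data.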
📗 Sample File:
=============
Practice the steps by using the Wikipedia link:
en.wikipedia.org/wiki/List_of...
Download the sample file from here: chandoo.org/wp/wp-content/upl...
⏱Video Topics:
=============
0:00 - Web Scraping data to Excel - The problem
1:04 - URL based data Extraction with Power Query
3:40 - Cleaning up the data (Transforming with PQ)
5:54 - Loading data to Excel
6:20 - The problems with Power Query method & solution
💡BRIGHTDATA:
=============
If you have a more complex data collection need, then I highly recommend using BrightData. Using their tools, you can automate data collection, clean-up and archival for all situations. If you use my link below, you get a FREE DEMO & $250 matching credit.
LINK 👉 brightdata.grsm.io/chandoo
~
MORE Power Query 💻⚡:
======================
Power Query can greatly automate & simplify your data processes. If you are new to this revolutionary technology, check out the videos below.
🕑🕑🕑 1+hr deep videos:
What is Power Query and how to use it with 4 Practical Examples (250k views) - • Power Query Tutorial -...
Automate Data Tasks with Power Query (10 examples) - • 10 Ways to save time &...
⏳Short but powerful videos:
How to combine data from multiple sheets with PQ (250k views) - • AWESOME Excel trick to...
Combine data from multiple FILES with PQ (150k views) - • Powerful trick to comb...
Data cleaning with Excel (10 tips) - • Data Cleaning in Excel...
🎶Play Lists:
Power Query tips & tricks - • Power Query Tutorial -...
Data cleanup & automation - • Data Cleaning in Excel...
~
#Excel #webscraping
~
A data analyst was deathly afraid of spiders. He could never web scrape.

Category: Science
Published: 23 Jul 2024

Comments: 116
@jackielinde7568 2 years ago
As someone from Arizona, I can say we definitely love chocolate here. Also, don't let kids sell boxes of chocolate, as they tend to eat more than they sell.
@chandoo_ 2 years ago
Oh well, that explains the anomalies. 🤣
@tahirajabeen7175 3 months ago
@@chandoo_ Sir, can we insert pictures into Excel from a folder on a PC using the Merge Columns feature in Power Query?
@arun.kumar.s 2 years ago
I don't know how far @Chandoo will go to educate others; there are new things to learn every time. Be Awesome
@chandoo_ 2 years ago
It is my mission to help one million people become awesome at their work. When I get there, I am just going to 10x it and make it my mission again.
@vijayarjunwadkar 2 years ago
Great Idea! Thanks Chandoo for sharing this! 👍
@chandoo_ 2 years ago
My pleasure 😊
@Azhar_Khan383 2 years ago
Thank you, sir, for keeping us up to date with time and skill development. I'm becoming awesome. Once again Thanks.
@chandoo_ 2 years ago
You are welcome Azhar.
@norfolkflyingboyz2404 19 days ago
Great clear video thank you. ❤
@Excelambda 2 years ago
Great video!! Super cool tricks!!✌ For this particular case we can also use Data Types. For example:
In H2 we write USA. With H2 selected, go to: Data tab, Data Types, Geography, and select the first result from the side pane.
In I2: =H2.Subdivisions
In J2: =H2.Subdivisions.Population
There are discrepancies though. To visualize the differences, a formula (XLOOKUP syntax technique when we deal with data types):
=LET(s,pop[State or territory],cp,pop[Census population],cp-XLOOKUP(s,H2.Subdivisions.Name,H2.Subdivisions.Population))
@chandoo_ 2 years ago
Great suggestion EL. While data types are a good option, PQ is more versatile and universal.
@Excelambda 2 years ago
@@chandoo_ PQ rules!!✌🙏
@NealBurkard-ut1oo 11 months ago
For this specific example, you don't actually need to use scraping. I forget the function name, but you can classify cells; one common classification is geography. Type in the state, and a drop-down box appears to indicate the state: New York State, USA, for example. Then you can use other cells to reference the geographical cell to list characteristics like population, total area, area by land, area by water, all different types of descriptors pertaining to New York State. So generate all 50 states, create the columns of criteria you want, and do all the referencing for the top row. Then just fill down for the other 49.
@chrism9037 2 years ago
That was awesome, thanks Chandoo!
@chandoo_ 2 years ago
Glad you liked it!
@muraliiyer7850 2 years ago
Nice presentation and great!!
@jamesbautista2437 1 year ago
Thank you for your video. May I ask if you can do a video on automating data entry into the Chrome browser, i.e. Excel data to Chrome, without a third-party app? Thank you so much for your videos.
@sharma3226 1 year ago
Sir, could you please guide me in creating an auto-updating Excel table of bestselling book ratings for a specific category, say personal finance? 🙏🏾
@mohamedhaibe7681 2 years ago
Thank you!
@user-vg1zf8dn5m 2 months ago
Hello, thanks for the video! Is there a method for sites that require a login?
@sunny4christjesus 2 years ago
Awesome Literally ❤️
@insidehead 2 years ago
Hi, could you help with how to connect to an Oracle 10g DB?
@atlasgunther8947 4 months ago
Anyone know of a detailed video to scrape sofascore's historical score data? I currently have to scroll manually to scrape it. TIA.
@shanmugapriyayu2141 1 year ago
Can you please show videos about educational data, like mark analysis or teacher analysis, in Power BI?
@angelvargas9042 1 month ago
How come when I put in the URL from other websites the tables don't pop up, but they do for Wikipedia? Any suggestions?
@bhavikapawaskar6381 11 months ago
I need help!! We have an application that we normally use for filling in information about working employees, and that application generates a UNIQUE ID. Step 2: I have to audit those cases, but at the same time I want to check multiple UNIQUE ID cases with different names in Excel by filtering the data. So tell me, how should I audit multiple records for a UNIQUE ID?
@mayanksharma266 2 years ago
Sir, I didn't get the dataset for the previous video, "pivot table".
@yafethtb 1 year ago
AWESOME! I tried it on a website that cannot be scraped via the requests library in Python, and I could get the table from its page!
@cynthiahoz3948 2 years ago
I love dark chocolate as much as I love your videos❤️
@amlevin 7 months ago
Great video! Is it possible to scrape if the number of pages is not fixed and the number of pages is available via a link?
@VisuLytics 2 years ago
Very helpful video Chandoo G
@gzfraud 11 months ago
Hi Chandoo .... GREAT video. QUESTION .... I scrape 10,00+ webpages so this will really help. BUT if a URL is embedded in text on a webpage, PQ or BI won't extract the URL, e.g. when an email address is embedded in the person's name. I've searched and can't find it. Any ideas?
@myheliography 1 year ago
Hi Chandoo, when I tried to scrape data from a website using Power Query, the website first led to a disclaimer (accept or reject). Please guide me on how to skip it.
@niharraval986 2 years ago
Hi Chandoo, thanks for the informative video once again. I have one question: how do big organizations deal with their data for data visualization or daily analysis? Do they use any specific tools to scrape the data from their databases? Is it autogenerated, or does one person have to do it manually every day?
@chandoo_ 2 years ago
Normally big organizations have dedicated data teams, which in turn have ETL teams that do the data extraction, cleanup, automation and storage processes. They are also called Data Engineers. While these folks take care of 70% of data needs, the other 30% falls to individuals/departments for their specific needs. This is why learning a bit of Power Query, SQL and data clean-up techniques can help you in many aspects of professional life.
@karankhetwani6832 4 months ago
Can web scraping be done on a MacBook?
@taizoondean689 2 years ago
Thank you, Sir. One request: could you create a video on Excel add-ins, their uses, and how we can automate using them?
@chandoo_ 2 years ago
You are welcome TD. What specific add-ins are you talking about? I rarely use them for my work.
@basicinfoforall7306 2 years ago
Sir, I won't be able to use your Bright Data because it needs a business ID.
@rising_individuals 2 years ago
Amazing
@Vasuu05 7 months ago
I tried to extract data from my Orf Intranet portal. I am not getting the data table. Can you please help with this?
@dlo009 2 years ago
Hi Chandoo, thanks for such a great video and for sharing your knowledge. I was wondering about your webpage though: have you closed the registration option, or have you set aside that part of your business? I love the way you explain Excel and Data Analysis, so thank you very much for making it possible. Cheers
@chandoo_ 2 years ago
Thanks Danilo. Feel free to email me about the courses so I can make an exception for you.
@dlo009 2 years ago
Dear @@chandoo_. Thanks for your quick response; I really appreciate that you are so considerate. At the moment I am in an awkward position because I recently had brain surgery. Thank God it went well. But one of the things I have to live with is that I am quite slow in learning, because of the lack of practice, the high demand of energy needed to focus on a subject, and my current financial situation. That makes me believe that this is not the appropriate time to participate in one of your courses, and that I will have to wait a little longer. My idea in subscribing to your website is to remain in contact with you and try to follow your steps. I also think that, because there is no way to see a proper layout or organization of the videos people post on YouTube, registering on your website might give me the chance to see the videos in a more organized way. I do have a bachelor's in CS and lots of Excel experience, but because of my sickness I am literally starting from 0 (not whining; I won't be the first nor the last to be in this kind of situation). Throughout my career, it has been the projects in which I used Excel as a UI, to create reports for middle and upper management, or to create interfaces with MSSQL, ADO-CSV, MYSQL, ACCESS, ... that I had the most fun with and in which I could see quick problem modeling and a quick solution response for the client. I would love to know where to find your email, as you suggested. I have been a fan of your work for a long time, years BTW, but for me things have been bumpy, so I really don't remember since when I have been following you on FB. Thanks again for everything. I will be watching your videos; I just found your "How I made $100k as Excel Freelancer", which is mostly the path I want to follow. Cheers.
@vishnu8899 2 years ago
Sir, how did you create that awesome chart at 6:11?
@eddardstark3272 9 months ago
What version of Excel is this?
@thetravelservice1235 1 year ago
Can you help me scrape Skyscanner prices into Google Sheets?
@offsonicstreams 9 months ago
Can we do that for sites like the Indian Railways website, where data can be accessed only after logging in?
@chandoo_ 9 months ago
Probably with the BrightData tool, but not with Power Query as of now.
@eversut1 1 year ago
I want to import data from a website that says "your browser is not up to date". I use Office 2016 and I can import data from other websites. Even if I change the web browser option, I still can't import data and receive the same error message. How can I fix the problem with the website I mentioned above? Thanks and regards.
@chandoo_ 1 year ago
I am not sure how to fix the problem here. Try posting it in Microsoft forums.
@songs-os4ci 5 months ago
Mine shows "DataSource.Error: The request was aborted: Could not create SSL/TLS secure channel." What's the problem?
@yapyh2872 8 months ago
Hi, how to scrape data that is not inside a table?
@amararora4647 2 years ago
On my previous laptop I used to press CTRL + Shift + Page Up (arrow keys) to switch between sheets; on my new laptop it's CTRL + Fn + Page Up (arrow keys). Can I change it to the old method on the new laptop?
@chandoo_ 2 years ago
Hmm.. not sure Amar. I suggest checking with your laptop provider.
@basicinfoforall7306 2 years ago
cool! bro...
@vinay94095 2 years ago
Hi, I have a query: can we paste a small table of 3 rows and 3 columns into a single cell in Excel?
@chandoo_ 2 years ago
You can. Just double click on the cell and paste.
@Amy-ej2px 1 year ago
I honestly would have just copied and pasted the table straight from the site bc I'd be concerned that the data wouldn't import correctly. I thought that you were going to point that out as being one of the issues. Is the technology behind it so good that the tables come through just as they are on the screen? What are the benefits of using this over a normal copy/paste?
@noedits5543 1 year ago
With this method, we can pull the data periodically, e.g. every minute or even every 10 seconds!
@renuka2740 10 months ago
How to overcome the restriction of only 100 rows being extracted?
@vikasinimicky4112 2 years ago
Hi, I have a small doubt. I want to get into data analytics and ultimately, after a couple of years, become a data scientist. Is this possible, or am I planning wrong? Please help me out with some information. Thank you.
@chandoo_ 2 years ago
You can certainly do that. I suggest talking to data scientists in your network to get an idea of the kind of work they do and start picking up the skills slowly.
@wajahatsaeed9602 2 years ago
How to extract a report from Oracle into Excel?
@lanieticman1807 1 year ago
Hi Chandoo, thanks for the incredible video. I just had a problem: when I click Data tab > Get Data > From Web, what I see is different from yours. Mine shows "NEW WEB QUERY" and a bunch of script errors with the question box "Do you want to continue running scripts on this page? Yes/No", while yours is "From WEB" with the choices Basic or Advanced. What do you think is the problem with my Excel? Thank you
@chandoo_ 1 year ago
That depends on the webpage you are connecting to.
@lanieticman1807 1 year ago
@@chandoo_ Thanks so much for your response. How can I make it the same as yours? My web seems to be MSN... I don't know how to change it.
@dilip.chityala 2 years ago
How do you know exactly what I am expecting each time, @chandoo? I need SharePoint folder names and creation dates from others' OneDrive.
@SpiritedTravellerr 2 years ago
Hello sir, I have a serious question: of all your playlists, which one should I start with, and which step-by-step playlist should I follow?
@chandoo_ 2 years ago
Hi Musafir, Thanks for your question. I suggest using the FREE Excel course playlist. As my videos cover a wide range of topics and have been made over 10+ years of uploads, it is impossible to find a thread that connects them all. If you want a step-by-step course without any distractions or ads, just join my Excel School program. Visit chandoo.org/wp/excel-school-program/
@SpiritedTravellerr 2 years ago
@@chandoo_ thank you sir
@rams6635 1 year ago
How to scrape from a website that needs a login?
@prithvisahay269 2 years ago
Sir, please tell us how to hide row borders like you always do. Please.
@vishalbhati912 2 years ago
Go to the View tab and uncheck Gridlines.
@sirisoj 2 years ago
Whenever I change the name of the table in Power Query, when the data changes and I go to Refresh, an error appears stating that the table with the name I created was not found in the source. 😐
@chandoo_ 2 years ago
Refer to the error message. Power Query doesn't like name changes for underlying tables or columns. You can redesign your data clean-up steps so that they don't depend on the names. But you must learn a bit more M language to make it smooth. See this video for more on Power Query - ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-PiFAa_jjaEI.html
@MirGlobalAcademy 6 months ago
Bright Data site is not working
@RAVIKUMAR-nx3od 1 year ago
If the data changes, for example if the author of the page adds more information to that webpage, will our data be updated automatically inside Excel or not?
@chandoo_ 1 year ago
It should be, as long as the underlying structure of the webpage is maintained (i.e. they used the same CSS class or table ID or something that you used in PQ).
@RAVIKUMAR-nx3od 1 year ago
Thanks for the quick response 😊
@venkatrooney9912 2 years ago
Hi bro, how can we remove duplicate words in a single cell?
@ajpw7695 1 year ago
Not sure of a quick way to do this, but you could:
1. Use Text to Columns with space set as the delimiter to create a row of cells, each containing one word
2. Cut and transpose-paste the cells
3. Remove duplicates
4. Concatenate the cells using TEXTJOIN with space as the delimiter
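The same split, de-duplicate and rejoin logic can also be sketched in a few lines of Python, in case the text is being cleaned outside Excel. This is only an illustration: `remove_duplicate_words` is a hypothetical helper name, and treating "Red" and "red" as the same word is an assumption made here, not something stated in the comment above.

```python
def remove_duplicate_words(text):
    """Keep the first occurrence of each space-separated word, in order."""
    seen = set()
    kept = []
    for word in text.split():      # step 1: split on whitespace
        key = word.lower()         # assumption: compare case-insensitively
        if key not in seen:        # step 3: drop words already seen
            seen.add(key)
            kept.append(word)
    return " ".join(kept)          # step 4: rejoin with a single space


print(remove_duplicate_words("red blue Red green blue"))
```

Because the words are kept in a list, the original word order is preserved, which a plain set would not guarantee.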
@kasanibhanuvenkat9339 1 year ago
I just have one question: whenever there is a change in data on the website, does the data in Excel also update?
@larriemayodi2085 1 year ago
yes, it should
@kasanibhanuvenkat9339 1 year ago
@Chandoo
@JohnPaulIghorue 1 year ago
Yes, because the URL was copied and used, the data will automatically refresh and be included as long as the URL remains valid.
@lydiasaraswati517 1 year ago
How to add "boxes sold in 2021"?
@koshishlamsal8703 2 years ago
How to extract table data into Excel from the web with multiple pages ...please make a video on it ...thank u in advance 😁
@chandoo_ 2 years ago
Good idea. Do you have any examples of such websites? Often multi-page sites block PQ based access in my experience.
@JJ_TheGreat 2 years ago
@@chandoo_ How about writing in multiple URLs in an Excel table and creating a function in PQ to scrape them!
@chandoo_ 2 years ago
Yes, we can do that. If the URL has a pagenum parameter, we can also parameterize it with PQ functions.
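To illustrate the pagenum idea from this thread, here is a small Python sketch (not Power Query M) that builds one URL per page by setting a page-number query parameter. The `page_urls` helper, the `example.com` address and the parameter name `pagenum` are assumptions for illustration; a real site may name or structure its paging differently.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def page_urls(base_url, last_page, param="pagenum"):
    """Return base_url once per page, with the page number set in the query."""
    parts = urlsplit(base_url)
    query = dict(parse_qsl(parts.query))   # keep any existing parameters
    urls = []
    for page in range(1, last_page + 1):
        query[param] = str(page)           # add or overwrite the page number
        urls.append(urlunsplit(parts._replace(query=urlencode(query))))
    return urls


for url in page_urls("https://example.com/list?sort=name", 3):
    print(url)
```

Each generated URL could then be fed to whatever scraping step is used per page, which is the same shape as a Power Query function invoked over a table of URLs.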
@ducnguyenhong3173 2 years ago
Sorry for asking: is this Excel for Mac or Windows? I couldn't find the location of "Import from Web".
@chandoo_ 2 years ago
I am using Excel for Windows. I don't think Mac Excel's PQ has Import from web yet.
@ducnguyenhong3173 2 years ago
@@chandoo_ Thanks so much for your answer. In Google Sheets, we just type "IMPORTHTML", right?
@MChehab100 2 years ago
What about sites that require user name and passwords?
@JJ_TheGreat 2 years ago
That's just another type of authorization - on that screen he showed.
@viniciuscastro3137 2 years ago
👏🏻👏🏻👏🏻
@shab1467 2 years ago
I just started following your channel to level up my Excel skills. But I am not sure where to begin, as there are a huge number of videos on your channel. Can you share some tips or videos from your channel?
@chandoo_ 2 years ago
Welcome aboard Shab. I suggest watching the FREE Excel course videos first. ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-F7aPazuS8QY.html Then go for the videos in Excel for Data Analysis playlist. ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-v2oNWja7M2E.html If you need a step-by-step course, I suggest going for Excel School program - chandoo.org/wp/excel-school-program/
@shab1467 2 years ago
@@chandoo_ Thank you for your response. I will definitely check out the courses and videos you recommended
@JohnPaulIghorue 1 year ago
@@chandoo_ Thanks so much! I was wondering why I was not receiving alerts of your updates, then saw that in all my learning I failed to click the subscribe button. Please forgive me. Now done!!! ..and thanks for this learning pathway!
@monuthakur7649 2 years ago
I don't understand where boxes sold came from.
@chandoo_ 2 years ago
We already have that data. It can be any internal data.
@TSSC 2 years ago
Great, but using the geography data type would be a strong contender in a real-life situation.
@chandoo_ 2 years ago
That is a good alternative too. As data types are less widely available than Power Query, I chose the latter option. Plus, it aligns with the sponsor for the video too. 😀
@TSSC 2 years ago
@@chandoo_ Certainly. Just wanted to contribute with that option for viewers that had forgotten about (or were unaware of) the geography data type.
@TSSC 2 years ago
@@chandoo_ And congratulations on reaching 250k subscribers. There’s a good reason for the number going up.
@asifjames112 1 year ago
Wooo 999+1=1000 likes🤘🤘
@moneymachine100 2 years ago
LOL have ya seen the pricing $1000 - $2000 a month OMG! lol Yer where do i sign up quick quick NOT! Get Real mate.
@chandoo_ 2 years ago
Just because you haven't found their service of value doesn't mean others won't. Hundreds of companies and businesses use their product all the time.
@BrightData 2 years ago
Our pricing packages reflect the amount of data a customer needs to collect on a monthly basis. If you are looking to collect smaller amounts of data we can support you as well. Set up a call with a sales rep who will gladly help you find the package that is best for you and your business.
@Funkteon 1 year ago
I'm sick of seeing the exact same example on every one of these Excel web scraping videos. It's ALWAYS "Here's how to scrape this table from Wikipedia into your worksheet", it's never how to scrape a single data point from a webpage such as a dynamically changing number or name etc that can be identified with an isolated XPath...
@chandoo_ 1 year ago
Why not search a bit more or better still, build one yourself? How are you sick of free help and guidance offered by someone you barely know? Here are few other web scraping videos - ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-VRLxcN_w-rg.html ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-RU9D-CyKwEQ.html
@Pooty_With_A_Fat_Booty 4 months ago
I noticed the same thing. It's always from Wikipedia or other source easy to extract. 😂