
3 ways to REDUCE DATA In Power BI Desktop (No Premium needed!) 

Views: 55K
Likes: 1,155

Sometimes you want to reduce the data in Power BI Desktop so it's easier to work with, then work with all the data in the Power BI Service. Patrick shows you 3 ways to do this without using Incremental Refresh or needing Power BI Premium!
📢 Become a member: guyinacu.be/membership
*******************
Want to take your Power BI skills to the next level? We have training courses available to help you with your journey.
🎓 Guy in a Cube courses: guyinacu.be/courses
*******************
LET'S CONNECT!
*******************
-- guyinacube
-- awsaxton
-- patrickdba
-- guyinacube.com
***Gear***
🛠 Check out my Tools page - guyinacube.com/tools/
#PowerBI #ReduceData #GuyInACube

Science

Published: 24 Mar 2021

Comments: 65
@hatimali4945 3 years ago
Your videos have really helped me grow my Power BI practice in my organisation... thank you so much😀
@dinoben9527 3 years ago
I have been using date parameters for a proof of concept for my client and it works very well. It allowed me to avoid technical limitations of the relational database in use at the organization. Once in the service, all the data was pulled. I didn't know there was a simple way to limit the number of rows directly via Power Query. I'll test that now. Thank you for the useful information.
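For readers wondering what that row-limit step looks like in code, here is a minimal Power Query M sketch of the idea. The RowLimit parameter and the server/table names are assumptions; treating 0 as "load everything" lets the same query serve Desktop and the Service.

    let
        // Placeholder source; server, database, and table names are assumptions.
        Source = Sql.Database("myserver", "mydb"),
        Sales = Source{[Schema = "dbo", Item = "FactSales"]}[Data],
        // Keep only the first RowLimit rows while developing in Desktop.
        // RowLimit is a hypothetical Whole Number parameter; setting it to 0
        // in the Service restores the full table.
        Result = if RowLimit = 0 then Sales else Table.FirstN(Sales, RowLimit)
    in
        Result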
@SlothyUK 3 years ago
Thanks so much... spent so much time working in the editor making small changes and waiting for them to apply on 6MM rows... cheers
@apekshaawaji185 3 years ago
Thank you so much Patrick😀 This is really helpful! Would love to see videos on customer attrition analysis. Cheers👍
@vpwl 3 years ago
Another amazing video! Great job Patrick! I always wondered how models with large data sets were created and published without a powerful machine.
@TitusRex 3 years ago
I was using exactly this approach (parameter), I'm glad I'm not the only one.
@miguelguillenpaz 7 months ago
Hi Patrick, thank you so much. I had been looking for this information for a while.
@intelabhinav 3 years ago
I love the fact that you can select which table you want to load - the filtered one or the unfiltered one. I am using this feature to test the load with fewer rows.
@otakarveleba7202 3 years ago
Hi Patrick, great video as always. I totally forgot about query folding, as I'm usually writing custom SQL queries that PBI can't fold anyway... so I usually inject my parameter into the SQL query (into the WHERE clause) directly. Unfortunately, this approach makes the SQL query quite hard for humans to read (it adds lots of code like #(lf) instead of a new line, etc.). Thanks for bringing up the possibility of folding our queries in an intuitive way again.
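For comparison, a hedged sketch of the WHERE-injection approach described above, building the statement from a list of lines so the generated M avoids the hard-to-read #(lf) escapes. StartDate, server, and table names are assumptions, and a hand-written query like this will not fold further.

    let
        // Build the SQL text from a list of lines; Text.Combine inserts the
        // newline characters, so no literal "#(lf)" clutters the statement.
        QueryText = Text.Combine(
            {
                "SELECT *",
                "FROM dbo.FactSales",
                "WHERE OrderDate >= '" & Date.ToText(StartDate, "yyyy-MM-dd") & "'"
            },
            "#(lf)"
        ),
        // StartDate is an assumed Date-type parameter; the server and database
        // names are placeholders.
        Source = Sql.Database("myserver", "mydb", [Query = QueryText])
    in
        Source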
@ayouberrarhbi8912 3 years ago
I was looking for this trick for a long time. Hours and hours wasted waiting for refreshes. Thank you!
@AlbertoTrujillo 3 years ago
Great video, Patrick!
@sumitchoubey3859 3 years ago
Informative session
@sands7779 3 years ago
Really useful tips that will help people working from home over slow broadband and VPN develop and deploy reports while spending less time waiting for rows to load. Thanks Patrick 😊
@saikirankusumanchi4522 3 years ago
This helps. Thanks for this trick.
@pabeader1941 3 years ago
OOOOOHHH! If only I was allowed to use the service!!! So many things would be possible. Great content and I will keep it in mind if I ever get them to change their minds.
@felipefrancisco01 3 years ago
That's awesome, man.
@singhraunak09 3 years ago
Wow superb video ❤️
@KateMikhailova 1 year ago
Thanks for the great idea (the 2nd way, with a parameter)! I can't use it as-is because Report Server (on-premises) doesn't have a parameter option for scheduled refresh. But (as always) an xlsx file with the number of rows, sitting in a network folder, works well :) Following Matthew's "upstream" maxim, I included the number of rows from the xlsx file in the SELECT section of the query. Now I know that "first 100 rows" translates differently for an MS SQL source and an Oracle source. Interesting experience!
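A rough M sketch of the xlsx-driven row limit described above; the network path, sheet name, and column name are all assumptions. The resulting query can then be referenced anywhere a parameter would be used.

    let
        // Read the config workbook from a network share (path is a placeholder).
        Workbook = Excel.Workbook(File.Contents("\\share\config\rowlimit.xlsx"), true),
        // Assume a sheet named "Config" whose first row has a RowLimit column.
        ConfigSheet = Workbook{[Item = "Config", Kind = "Sheet"]}[Data],
        RowLimit = ConfigSheet{0}[RowLimit]
    in
        RowLimit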
@naveenkumarm.r4692 3 years ago
Hi Patrick, I need a solution for monitoring scheduled refreshes across multiple workspaces... can you please make a video on this?
@pinstripeowl 2 years ago
thank you for this - I'm going to try to set up some refresh parameters now on the report I'm building, because it links to Salesforce but I only need to see one of the territories/divisions, so hopefully, hopefully, this will help reduce the thousands of lines it pulls through!
@pinstripeowl 2 years ago
it worked! Thank goodness, that's definitely reduced the strain on this(!)
@aliramadan7425 3 years ago
Hello Patrick. Interesting stuff. Do you have any advice on optimizing performance between Power BI Desktop and AWS Athena using ODBC connection? Would the reduce rows option you showed be a good option in case of 150+ million rows? Appreciate your feedback.
@vpwl 3 years ago
Question - if my model includes multiple tables and data sets, would I need to create separate parameters for each table? Will I be able to adjust them all in the service?
@watkinson8230 3 years ago
Hey Patrick, does this work if you are appending multiple CSVs and then applying various transformations? I.e., would the filter on the final table be enough to limit the refresh, or do I need to do it in each source import?
@samiphoenix5268 3 years ago
Another easy way to reduce dataset size without reducing the number of rows is to convert decimal values to whole numbers (assuming you don't need the additional accuracy).
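A tiny self-contained M illustration of that tip; the inline table is made up.

    let
        Source = #table(
            type table [Product = text, Amount = number],
            {{"A", 19.99}, {"B", 5.25}}
        ),
        // Round and store as a 64-bit integer; low-cardinality whole numbers
        // compress far better in the VertiPaq engine than decimals do.
        AsWhole = Table.TransformColumns(Source, {{"Amount", Number.Round, Int64.Type}})
    in
        AsWhole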
@Acheiropoietos 3 years ago
Yes, I discovered this recently - who needs pennies with a bottom line in the millions?
@frankcook5884 3 years ago
@@Acheiropoietos Finance
@jaypollock78 3 years ago
Definitely! I was able to cut my dataset size down by 40% by doing this.
@tymccurdy8932 2 years ago
Done something similar with the parameter values. Instead, we set up two dataflow entities to work off of: a full set of the data and a sample set of the data, named Source & Source_Sample. Then we can just parameterize the source. So start with the parameter pointing to Source_Sample; when it's all done and published, update the parameter to Source. I like this way, as it standardizes the process for all report builders in our company and we can ensure they're getting a good distribution of the data within the sample. Report writers just toss a couple of standard lines into the Advanced Editor (copy/paste) and they're all set up. We don't need to rely on them finding janky ways to get their own sample set.
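A schematic M sketch of that Source / Source_Sample switch, assuming a hypothetical true/false parameter UseSample and placeholder workspace/dataflow IDs.

    let
        // The only thing report builders ever change is the parameter value.
        EntityName = if UseSample then "Source_Sample" else "Source",
        Dataflows = PowerPlatform.Dataflows(null),
        Workspaces = Dataflows{[Id = "Workspaces"]}[Data],
        // The GUIDs below are placeholders for the real workspace and dataflow.
        Workspace = Workspaces{[workspaceId = "00000000-0000-0000-0000-000000000000"]}[Data],
        Dataflow = Workspace{[dataflowId = "00000000-0000-0000-0000-000000000000"]}[Data],
        Entity = Dataflow{[entity = EntityName, version = ""]}[Data]
    in
        Entity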
@premjitchowdhury262 1 year ago
This channel is making Power BI no. 1 in the world.
@runexn5940 4 months ago
Q: I had a question about using the first tip with incremental refresh. The table I want to implement incremental refresh on takes too long to load into my desktop file, so I am using a stored procedure to load only a small subset of the data into the desktop file; after implementing incremental refresh on it, I publish it, then change my stored procedure to bring in all the data and refresh in the service. This method worked before without incremental refresh, but now the initial refresh fails after some 8-10 hours and I am not sure what the exact cause may be. Is it because I am using a stored procedure with incremental refresh? Or do I have to load all the data into the file before publishing for the initial refresh to work?
@getusama 2 years ago
That is some great advice - thank you. I was also wondering, for the third way using start and end dates: are you able to set the values dynamically, so the end date is the current date and the start date is, say, 3 months earlier? That way, once you have uploaded the report and set the schedule, the start and end date parameters in the Power BI service would always use the current window (which will be different every day).
@ruelformales5184 8 months ago
I built upon your question, because you are correct that you need the latest data (see the sketch after this comment):
1. Create a parameter "Date" which will act as your start date (don't worry about the end date).
2. In your transformations, pick a date column, choose "Is After", and use your Date parameter - say, a date 3 months ago.
3. Publish your PBIX, then change your parameter date as far back as you need (e.g. 5 years ago).
4. Let the heavy lifting happen in the Power BI service - you should get the data from 5 years ago until today.
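A minimal M sketch of steps 1-2 above; the StartDate parameter, server, and column names are assumptions.

    let
        // Placeholder source; server, database, and table names are assumptions.
        Source = Sql.Database("myserver", "mydb"),
        Sales = Source{[Schema = "dbo", Item = "FactSales"]}[Data],
        // The "Is After" filter in the UI generates a step like this; against a
        // foldable source it is pushed down as a WHERE clause.
        Filtered = Table.SelectRows(Sales, each [OrderDate] > StartDate)
    in
        Filtered

For the rolling window asked about above, the filter value could instead be computed at refresh time, e.g. Date.AddMonths(Date.From(DateTime.LocalNow()), -3), so each scheduled refresh in the Service picks up the latest three months without anyone touching the parameter.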
@marounsader318 3 years ago
hi matt, are you running the refresh on a laptop or a server? Because the refresh of the 500k rows is pretty fast. If it's on a laptop, can you please share the specs, as even 100k rows take a lot of time on my work laptop.
@akhileshgupta1971 2 years ago
Patrick - please start a step-by-step tutorial series for becoming a pro at Power BI. I have looked through all your uploaded videos and can't find a tutorial series on Power BI. If you have already published all the modules one would need to become a professional, please arrange them in order.
@Carito_k 6 months ago
If I make a parameter to select only 100 rows, does it affect my final result?
@meeradominic9094 3 years ago
Hi, I'm looking to change an existing report from ODBC to Power BI. The source is PostgreSQL. After I installed the connector on my local machine, I could successfully connect to the DB by changing properties in the query's Advanced Editor. But when I published the report to the Service, it is not working. Could you help with it?
@ronvdlaan1 3 years ago
Hello Patrick, I set this up and it works fine with a value in the parameter, but when I set the parameter to 0, all my previous steps in the Query Designer are lost. What can my problem be?
@arnaudcampestre 1 year ago
Would this first method help to publish a pbix file that is above 1 GB?
@lestatelvampir69 3 years ago
How would you configure the editor when the source is not a single file but a folder? I can't see how to do it :(
@haydehkaveh3047 3 years ago
I have adjusted the parameter to 100 rows, but the cloud service does not let me refresh because of a credential/password issue. Any thoughts?
@muralichiyan 1 year ago
Nice, but the end user wants to download a normal export without the reduced data - how to do that?
@marymurphy9014 1 year ago
Can the parameter(s) be a list of 500 or so employee IDs? The only thing I have to limit the 6M+ record pull is the employee ID or employee email. I don't need all 125,000 employees - just our little 500+ person group.
@sayelichakraborty7222 1 year ago
Hi Patrick, I am having a weird error. When I put NumberOfRows in the Advanced Editor and click OK, it shows the error "The import NumberOfRows matches no exports. Did you miss a module reference?" Please help.
@mcnater 3 years ago
Big problem with 2 and 3 at our org: we have millions of rows, and those queries run first before it can get to the steps limiting the data. So it's still a huge pain on desktop to sit there and wait. We also don't have access to views, so that's out the window. I have made some parameters that switch between "ROWNUM < 100" or "TOP 100" depending on our server, and that works OK, but it's still kind of a pain.
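A sketch of that per-server switch; ServerType and RowLimit are assumed parameters and the connection details are placeholders.

    let
        Limit = Number.ToText(RowLimit),
        // Oracle limits rows with ROWNUM in the WHERE clause; SQL Server uses TOP.
        QueryText =
            if ServerType = "Oracle" then
                "SELECT * FROM FACT_SALES WHERE ROWNUM <= " & Limit
            else
                "SELECT TOP " & Limit & " * FROM dbo.FactSales",
        Source =
            if ServerType = "Oracle" then
                Oracle.Database("myhost", [Query = QueryText])
            else
                Sql.Database("myserver", "mydb", [Query = QueryText])
    in
        Source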
@akhilannan 3 years ago
I think query folding should work here, which should push back the filter directly to the source.
@mcnater 3 years ago
@@akhilannan most of our code is done with custom SQL from the start, so I don't think so in that case.
@HenriqueAzvdo 1 year ago
What did you do? I have the same problem: I limit it by date (start date and end date), and first it loads the entire query, which takes hours and hours.
@cowpoke98 10 months ago
Bingo, the video is a little misleading because it says "limit the # of rows being brought back from the data source". But that's not what these are doing; it's applying the filter AFTER all of that data is queried. Every time you modify a table or anything and "Apply changes", you have to sit through all that data being brought in again. There needs to be a simple way to limit the # brought over on import.
@mcnater 10 months ago
100% correct @@cowpoke98
@ahmadbadalov8489 1 year ago
This didn't work for me, because I was calculating the last 30 days, filtered from 1 Jan 23; only on 1 Jan 23 does it not show correctly on the service.
@pankajsonawane3958 3 years ago
Hi Patrick, huge fan of yours. I am stuck with a time slicer. I have a static time column with values from 12:00:00 AM to 12:59:59 PM in the time table. I want to create a time slicer that looks like a range slicer. If I use the "Between" option in the slicer, I don't get time values to select in the boxes; I see date values there. My goal is to filter the page by a time range. Please help!
@haydehkaveh3047 3 years ago
Even now that the credential password is working, modifying the parameter still does not refresh my report.
@LandscapeInMotion 3 years ago
What about joining tables with millions of rows? Performance? Normally you'd need SSAS...
@GuyInACube 3 years ago
SSAS Tabular is the same VertiPaq engine as Power BI. It works :)
@GurpreetSingh-jg9ll 3 years ago
👌🏼👍🏼
@Kaiyening 3 years ago
Dat Power BI yellow Apple Watch strap is glorious! 😆 Also I'm anxious to upvote this video as it's at 666 likes now ._.
@rahulmaurya9444 3 years ago
Can we apply a custom "last one month" filter in the Advanced Editor?
@Bharath_PBI 3 years ago
Yes
@CuddleStories 3 years ago
Will Smith knows Power BI?!?
@AbouAli01006300091 1 year ago
The 3rd method is the most interesting, BUT what if I want the complete period (5 years, for example) with a reduced dataset? The workaround I used (see the sketch below):
1. Sort the dataset on the date column.
2. Add an index column.
3. Add a helper column that divides the index values using Number.Mod (say by 10, which means keeping only one row out of every 10).
4. Filter the helper column to keep 0 values only.
BINGO: I reduced my data to a 10% sample of the total volume and kept the whole date range.
Advantage: having sample data across the entire time range.
Disadvantage: step 1 means the full dataset is loaded in the first step :(
I hope someone can achieve the same goal without loading the full dataset.
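A sketch of those four steps in M; the source and column names are placeholders, with the same caveat that sorting forces a full read of the source.

    let
        // Placeholder source; server, database, and table names are assumptions.
        Source = Sql.Database("myserver", "mydb"),
        Sales = Source{[Schema = "dbo", Item = "FactSales"]}[Data],
        Sorted = Table.Sort(Sales, {{"OrderDate", Order.Ascending}}),
        Indexed = Table.AddIndexColumn(Sorted, "Index", 0, 1),
        // Keep one row in every 10 (indexes 0, 10, 20, ...).
        Sampled = Table.SelectRows(Indexed, each Number.Mod([Index], 10) = 0),
        Result = Table.RemoveColumns(Sampled, {"Index"})
    in
        Result

On SQL Server specifically, a native query using TABLESAMPLE (10 PERCENT) might avoid reading the full set, at the cost of query folding and of a less evenly distributed sample.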