One of the best quotes from Matthew (which I picked up from some Saturday GIAC session a while back): "Pro tip: The more you know about how your tools work, the more effectively you will be able to use those tools to solve complex problems." I use this on a practically weekly basis (if not more often) to explain to colleagues why they should embrace modern information tools such as Power BI.
One exception I find to this maxim is a custom column made explicitly for a one-off visual on a report. Full name? Sure, send it to the warehouse. Full name + project task to make a pivot table row header more useful? Keep it in DAX for greater flexibility. I would rather say send it as far upstream as possible if it is to be used by more than one report. Also, anything more involved than concatenation in DAX I would send to Power Query for better transformation possibilities. Just my 2 cents.
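If that kind of column does get pushed back into Power Query, it might look roughly like this (the table and column names are made up for the example):

```
let
    // Hypothetical source; in a real model this would be a previous step
    // or a reference to another query.
    Source = ProjectAssignments,
    // Build the row header once in the query instead of as a DAX column.
    AddedRowHeader = Table.AddColumn(
        Source,
        "Full Name - Project Task",
        each [FirstName] & " " & [LastName] & " - " & [ProjectTask],
        type text
    )
in
    AddedRowHeader
```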
This is something I'm trying to get better at. I'm newish to this world; I've been in data analysis for about 1.5 years. It is so easy to pull your data and then do all the calculations in Power BI, but I'm trying to think ahead where I can, or go as far upstream as I can to get what I need. I've been using this to optimize many of my earlier reports. Thanks! I've learned so much from your videos!
Love the simplicity and logic of this video. It answers the question once and for all. I'm going to share it with our Power BI dev Teams chat tomorrow morning.
Yes, because it doesn't matter that it's slower; it's done once and only when there's new data. When your user asks for the data it's already there, so the reports are faster.
Great to see some actual Cube Visitors! I was expecting dataflows to pop up though, which would be a common use case for org-wide dimensions like Customer or Employee Full Name.
Just started a position with a new team that is very inexperienced with Power BI, and I found myself reciting this maxim. Often, when designing a report model, I find myself going back into Power Query to define a new column to leverage in the model. One of those areas is the Sort by Column functionality, which often won't work when you try to sort a calculated column by the column it references. For example, if I have a date column and make a short date calculated column off of it, I don't think I can then use the referenced column as the Sort by Column for the short date. So it makes more sense to create the short date column in Power Query; then it can use the date column it references as its Sort by Column (something like the sketch below).
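A minimal sketch of that pattern, assuming the table already has an OrderDate column (the names here are hypothetical):

```
let
    // Hypothetical previous step / source query.
    Source = Orders,
    // Add the display column in the query instead of as a DAX calculated column.
    AddedShortDate = Table.AddColumn(
        Source,
        "Short Date",
        each Date.ToText([OrderDate], "MM/dd/yyyy"),
        type text
    )
in
    AddedShortDate
```

Because the text column is produced in the query rather than by a DAX expression, the model treats it like any other column, and OrderDate can be set as its Sort by Column.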
Great explainer! I have a Power Automate flow that publishes Guy in a Cube videos into a Teams channel. I have been talking about this maxim regularly, but having an explainer from Matt was the ::chef's kiss::
Newbie here... is it possible to build a report that allows the user to choose which dataflow table to use as the source? Instead of building/maintaining two reports that are exactly the same and differ only in one dataflow source?
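For what it's worth, one pattern in that direction is a Power Query parameter that switches the source, though it's chosen at design/refresh time rather than by a report reader in a slicer. A rough sketch, assuming the parameter and both dataflow tables already exist as queries with these made-up names:

```
let
    // SourceChoice is a text parameter; DataflowTableA and DataflowTableB are
    // queries pointing at the two dataflow tables (all names hypothetical).
    Source = if SourceChoice = "Dataflow A" then DataflowTableA else DataflowTableB
in
    Source
```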
In this case, it is always better to use a view than tables, since we can transform the data within the view rather than using Power Query or DAX on the Power BI side. Am I right? But in reality we always use a data mart with a star schema, rather than a view or materialized view, as the source. I prefer to use one view as the data source, but I don't know if one view is indeed better than multiple tables. I talked with the Power BI team and the ETL team as well; no one supports using a view/materialized view as the data mart for Power BI.
I'd push back a little. As the chef in the kitchen, I want my ingredients as fresh as possible, not precooked or canned. As an analyst, the source data for my report is most helpful when the minimum number of transformations has been made to the data before I begin exploring. Business rules should only be applied in the data warehouse when necessary, or insights might be hidden behind assumptions.
Wouldn't it be even better, instead of creating the column in Power Query, to inject SQL while connecting to a SQL source (keeping it a native query)? You don't need access to transform the data at the source, but you can create this column on the fly while getting the data, without building it as a Power Query custom column.
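To illustrate the idea — a minimal sketch, assuming a SQL Server source; the server, database, and column names are placeholders:

```
let
    // The Query option sends the hand-written SQL to the server, so the
    // concatenation happens at the source instead of in a Power Query step.
    Source = Sql.Database(
        "myserver.database.windows.net",
        "SalesDb",
        [Query = "SELECT e.EmployeeID, e.FirstName + ' ' + e.LastName AS FullName FROM dbo.Employee AS e"]
    )
in
    Source
```

One caveat: steps added after a hand-written native query generally won't fold back to the source, so it's best to do as much as possible inside the query itself.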
@ZappPSR Native query performance is not that good when data volume is high and you need to reference it from multiple queries in Power Query. A data virtualization layer can be better for creating all such reporting calculations. It doesn't violate the rule, but it's not the best way to maintain reusable code inside Power Query.
@buddhigupta3557 I'm comparing making transformations in Power Query to making transformations in a SQL query inside Power Query. Either way the code lives inside Power Query, which doesn't favor reusability. Of course, the best way is to have it in the source.