Тёмный

Strategies for optimizing your BigQuery queries 

Google Cloud Tech
Подписаться 1,2 млн
Просмотров 62 тыс.
50% 1

Опубликовано:

 

16 окт 2024

Поделиться:

Ссылка:

Скачать:

Готовим ссылку...

Добавить в:

Мой плейлист
Посмотреть позже
Комментарии : 25   
@AlesNajmann
@AlesNajmann Год назад
This is the simply most important video to watch by the users of BigQuery 🔥
@premmohantyagi
@premmohantyagi 3 года назад
Perfect and easy examples to describe the impact of optimization.
@divertechnology
@divertechnology 2 года назад
great video i thought it gonna be light but is very deep in knowledge
@michelkant
@michelkant 3 года назад
Nice one! At how much data processing do you recommend to start working on optimizing the queries? As in when will the effect really start to show an impact? 1gb of processing, 10, 1000 or 1tb?
@leighajarett221
@leighajarett221 3 года назад
That's a great question! I think it depends on your pricing model. For example, if you're paying on-demand for each query you may be inclined to start optimizing queries that are scanning whatever #GB translates to a reasonable cost savings. This will also depend on how often the query is being run - for example if a query scanning 1TB of data is only run once that's $5, but if you run it everyday that's a lot! I would try to pinpoint queries that are run on a regular basis and are consuming the most resources, but set some threshold based on the total estimated ROI. For example, (the number of times the query will be run each year) x (the total cost of the query) > $1000 (or whatever is significant to your business) Check out the pricing calculator if you're estimating query prices: cloud.google.com/products/calculator/
@xiaoyangshawnhuang1251
@xiaoyangshawnhuang1251 2 года назад
thanks for the great explanation. I like "stay curious"! :)
@sevanajemian
@sevanajemian 3 года назад
amazing video for our new joiners :))
@abhishekjena9581
@abhishekjena9581 2 года назад
nice way of presentation .....good stuff
@majorcemp3612
@majorcemp3612 3 года назад
How do you know the "where" clause is performed before the join (5:17) ? Don't you have to do a sub query on the left joined (t2) table ? (also does this sub query be less performant ?)
@leighajarett221
@leighajarett221 3 года назад
You should be able to tell from the execution details :) In the first stage of the query, the workers will read from distributed storage and filter the data before shuffling and joining (which usually happens in the next stage). You can see the WHERE clause and the JOIN applied in the details of the stage to understand the order. You don't need the where clause in a subquery to filter the data before joining, but subqueries can be helpful if you want to pre-aggregate the data before joining to reduce the amount of data joined.
@majorcemp3612
@majorcemp3612 3 года назад
@@leighajarett221 so I don't need a where clause in subquery because everytime the where clause is being done at the distributed level ? Like everytime ? Or should I put a subquery to be sure ? (and if I put this subquery to be sure will I get bader performances ?)
@dheer211
@dheer211 3 года назад
Great video thanks Leigha
@jaysyanshar
@jaysyanshar 3 года назад
is the concept can also be applied to other SQL databases?
@yellowsubmarine7059
@yellowsubmarine7059 Год назад
Some these concepts could apply to other SQL database management systems (DBMS). Also some DBMS' are smart enough to apply some of these optimizations when generating the execution plan without having the user making changes to their SQL query.
@SasawatChanate
@SasawatChanate Год назад
5:50 Does the common practice of putting WHERE 1=1 as the first expression has any affect on this?
@Alex315101
@Alex315101 Год назад
really curious why such simple examples can not be automatically handled by query optimizer (which is already done by other databases)
@jackylee6839
@jackylee6839 3 года назад
At 5:25, why does clustering based on common join keys improve the efficiency more than partitioning?
@samhithnamburu
@samhithnamburu 2 года назад
Clustering on join keys leads to least number of shuffles to perform the join
@MyBestHibou
@MyBestHibou Месяц назад
A little surprise how little the default optimizer can do for us...
@ahmedhelal174
@ahmedhelal174 2 года назад
Thank you
@googlecloudtech
@googlecloudtech 2 года назад
You're welcome
@Code-ie8ub
@Code-ie8ub 3 года назад
thanks!
@gangsterrapmix8570
@gangsterrapmix8570 3 года назад
Google Google many businesses are always liked by people all over the world Thank you Google
@viacomsspy5439
@viacomsspy5439 3 года назад
Recommended is broken
@hritvikmohan4048
@hritvikmohan4048 3 года назад
Maal ho yrr tum toh....
Далее
Cloud cost management with Looker
8:20
Просмотров 10 тыс.
How does BigQuery store data?
8:20
Просмотров 50 тыс.
Understanding BigQuery data governance
8:35
Просмотров 26 тыс.
5 Ways to Improve Your SQL Queries
10:04
Просмотров 93 тыс.
Partitioning and Clustering with BigQuery
4:47
Просмотров 67 тыс.
7 Database Design Mistakes to Avoid (With Solutions)
11:29
What does a Data Analyst actually do? (in 2024) Q&A
14:27