
Querying 100 Billion Rows using SQL, 7 TB in a single table 

Arpit Agrawal (Elastiq.AI)
Subscribe · 1.5K subscribers
49K views

0:00 Introduction
0:59 The data
1:22 1K row Query
3:53 100K row Query
4:32 10M row Query
5:20 1B row Query
6:04 100B row Query
8:03 Query Costs
8:45 Conclusion
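The chapters above step through the same kind of aggregate query at ever larger row counts. As a rough, hedged sketch of what such a benchmark query looks like in BigQuery SQL (the table and column names here are assumptions, not the exact ones used in the video):

```sql
-- Hypothetical table; the video's actual dataset is not reproduced here.
-- At 100B rows / ~7 TB, BigQuery fans this out across many workers.
SELECT
  category,
  COUNT(*)    AS row_count,
  AVG(amount) AS avg_amount
FROM `my_project.my_dataset.big_table`
GROUP BY category
ORDER BY row_count DESC;
```

Because BigQuery bills by bytes scanned, selecting only the columns the aggregate needs (rather than `SELECT *`) is what keeps queries like this fast and cheap at these scales.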

Science

Published: 6 Jul 2024

Comments: 33
@alok5253 · 3 years ago
Simple and concise, thank you!
@PradeepMishra-qs2hz · 1 year ago
Awesome. Keep it up.
@WanderWisdom731 · 2 years ago
Wow, this experiment was a really amazing benchmark of BigQuery.
@vipulkumar7938 · 3 years ago
Well explained, thanks a lot.
@prathivenkatasaipavan9909 · 2 years ago
Great explanation.
@mathteacher5670 · 1 year ago
Excellent, sir, thank you so much. Highly motivational for a passionate person.
@ungeedh · 3 years ago
Nicely explained.
@vaibhavis1 · 2 years ago
Thanks for the explanation. I am curious: is it just scaling of the systems, or does BigQuery do query optimization to reduce latency as well?
@merhaiakshay9625 · 3 years ago
Please organize the videos into playlists. Great video, very informative and helpful, which led me to subscribe. Thanks 😊
@ashitoshthakur9402 · 2 years ago
Wow, what a great video, sir. Please also make a video on SQL with ML.
@arthurrodrigues5382 · 1 year ago
Amazing!
@himanish2006 · 2 years ago
This is good...
@AamirKhan-vu2om · 2 years ago
Hey, very informative. I came here searching for big data processing in seconds. I have a question: I would like to build a system where I import terabytes of data into a single table with keys, and I want all the DML operations to take very little execution time, as shown. Please help me out with how I can achieve this. I'm stuck.
@muhamadridwan4766 · 1 year ago
Wow!
@TheElementFive · 1 year ago
The first question you should always ask when working with a 100 billion row database: “Why do I have a 100 billion row database?”
@davidlean8674 · 11 months ago
And the answer would be "because I work with a multinational enterprise customer". If you have a large market share in China (1B people), India (1B people), Europe (0.75B), and the USA (350M people), it doesn't take long to get to 100 billion transactions. And if you want to do financial year-on-year comparisons, you need to keep at least 24 months of data, usually 36 months.
@skill-learning · 2 years ago
I appreciate your effort. Could you post the link for the Google Cloud project?
@sconnell194 · 3 years ago
👍
@houssem25000 · 27 days ago
So I don't have to care about performance when I make projects?!
@Hrzzz1 · 7 months ago
Can we download this database to run some tests? A nice idea for the next video would be to compare this same situation with a NoSQL database.
@Mju98 · 4 months ago
Hello sir. I tried to import 400K rows into the BigQuery sandbox but ended up with errors. Is it possible to import that data? Please, can anyone help me? It's urgent (interview assignment).
@aminremiiii · 1 year ago
Please, for 50 days I have been looking for this. I want to create 2000 users in MySQL and set the phone number as the username. Can you tell me how I can create that many users with a default password?
@13990 · 4 months ago
Nice video, but while narrating it is better to expand the screen than to show the videos side by side.
@visva2005 · 2 years ago
@Arpit Agrawal, good. Can you let me know what database is behind this console?
@elastiq-ai · 2 years ago
Google Cloud BigQuery 😁
@MDDM03 · 1 year ago
Marketing for Google Cloud... it states nothing about what to improve.
@nfacundot · 10 months ago
Hello, can I connect to it from PHP?
@MdRakib-rc6ub · 2 years ago
I need your help.
@Helloimtheshiieet · 1 year ago
I'm confused: were these indexes?
@elastiq-ai · 1 year ago
BigQuery doesn't have indexes. It has partitions and clustering.
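As a rough illustration of that answer: in BigQuery, partitioning and clustering are declared when the table is created. A minimal sketch, with assumed table and column names:

```sql
CREATE TABLE `my_project.my_dataset.events`
(
  event_ts TIMESTAMP,
  user_id  STRING,
  amount   NUMERIC
)
PARTITION BY DATE(event_ts)  -- date filters prune whole partitions
CLUSTER BY user_id;          -- co-locates rows so user_id filters scan fewer blocks
```

A query that filters on the partitioning column (e.g. `WHERE DATE(event_ts) = '2024-07-06'`) reads only the matching partitions, which reduces both latency and the bytes-scanned cost.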
@davidlean8674 · 11 months ago
This is nice but not that impressive. Obviously, the table is stored using columnstore compression techniques, so you only need to read the columns in the select list, and those are typically grouped in blocks of 1M rows or more. The block header pages keep rowcount values, so you are not reading every row, just the block headers of a single column. If your query forced a scan of all rows in the block, say by combining the filter with other fields in the same row or in other tables, you would no longer be in the columnstore sweet spot, and the difference in query speed would be more striking. Still good though, as this is a common use case.
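The distinction this comment draws, metadata-only aggregates versus scans that must read column data for every row, can be sketched with two hedged example queries (the table and column names are assumptions):

```sql
-- Can be answered largely from columnar block metadata:
-- the engine sums per-block rowcounts instead of reading rows.
SELECT COUNT(*)
FROM `my_project.my_dataset.big_table`;

-- Forces reading the referenced columns for every row in matching blocks,
-- so far more data is actually scanned.
SELECT user_id, SUM(amount)
FROM `my_project.my_dataset.big_table`
WHERE notes LIKE '%refund%'
GROUP BY user_id;
```

The second shape is where partitioning and clustering (rather than columnar metadata alone) determine how much of the table gets read.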
@toxiclife · 1 year ago
What should I do when I want to overwrite 100 million rows into a new table, in minutes? df.write.mode("overwrite").saveAsTable("FINAL"). If you could please help with this?
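The comment's `df.write.mode("overwrite")` is Spark; if the target is BigQuery, one SQL-level equivalent of that overwrite is `CREATE OR REPLACE TABLE`. A sketch, with assumed table names rather than the commenter's actual pipeline:

```sql
-- Atomically replaces FINAL with the query result in a single statement,
-- avoiding a separate delete-then-insert pass over 100M rows.
CREATE OR REPLACE TABLE `my_project.my_dataset.FINAL` AS
SELECT *
FROM `my_project.my_dataset.staging_table`;
```

Because the replacement is a single atomic DDL statement, readers never observe a half-written table, which is the usual failure mode of manual truncate-and-reload approaches.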