
5 million+ random rows in less than 100 seconds using SQL

Arpit Bhayani
125K subscribers · 18K views

System Design for SDE-2 and above: arpitbhayani.m...
System Design for Beginners: arpitbhayani.m...
Redis Internals: arpitbhayani.m...
Build Your Own Redis / DNS / BitTorrent / SQLite - with CodeCrafters.
Sign up and get 40% off - app.codecrafte...
In the video, I demonstrated how to generate 5 million rows of random data for a taxonomy structure using nothing but SQL. We focused on creating categories, subcategories, and topics with plain SQL commands, leveraging joins to amplify the data. By using a helper table named counters and insert statements driven by select queries, we populated the database efficiently. Careful structuring and the use of joins for data multiplication resulted in the generation of 5 million rows in under 100 seconds.
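A minimal sketch of the approach, assuming MySQL; the table names, row counts, and parent assignment below are illustrative, not the exact statements from the video:

-- Helper table holding the digits 0-9; cross-joining it with itself
-- multiplies the candidate row count by 10 per copy.
CREATE TABLE counters (id INT PRIMARY KEY);
INSERT INTO counters VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);

-- One self-referencing table for all three taxonomy levels.
CREATE TABLE topics (
  id INT AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(64) NOT NULL,
  type ENUM('category', 'subcategory', 'topic') NOT NULL,
  parent_id INT NULL
);

-- 100 categories (ids 1-100): two copies of counters give 10 x 10 rows.
INSERT INTO topics (name, type)
SELECT CONCAT('category-', c1.id, c2.id), 'category'
FROM counters c1, counters c2;

-- 10,000 subcategories (ids 101-10100), each mapped to one of the
-- 100 category ids derived from two of the digits.
INSERT INTO topics (name, type, parent_id)
SELECT CONCAT('subcategory-', c1.id, c2.id, c3.id, c4.id), 'subcategory',
       1 + (c1.id * 10 + c2.id)
FROM counters c1, counters c2, counters c3, counters c4;

-- Topics: seven copies of counters produce 10^7 combinations;
-- LIMIT caps the insert at 5 million rows.
INSERT INTO topics (name, type, parent_id)
SELECT CONCAT('topic-', c1.id, c2.id, c3.id, c4.id, c5.id, c6.id, c7.id), 'topic',
       101 + (c1.id * 1000 + c2.id * 100 + c3.id * 10 + c4.id)
FROM counters c1, counters c2, counters c3, counters c4,
     counters c5, counters c6, counters c7
LIMIT 5000000;

Each extra copy of counters in the FROM clause multiplies the candidate rows by 10, which is the data-amplification trick: a single 10-row helper table is enough to drive inserts of millions of rows.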
Recommended videos and playlists
If you liked this video, you will find the following videos and playlists helpful
System Design: • PostgreSQL connection ...
Designing Microservices: • Advantages of adopting...
Database Engineering: • How nested loop, hash,...
Concurrency In-depth: • How to write efficient...
Research paper dissections: • The Google File System...
Outage Dissections: • Dissecting GitHub Outa...
Hash Table Internals: • Internal Structure of ...
Bittorrent Internals: • Introduction to BitTor...
Things you will find amusing
Knowledge Base: arpitbhayani.m...
Bookshelf: arpitbhayani.m...
Papershelf: arpitbhayani.m...
Other socials
I keep writing and sharing my practical experience and learnings every day, so if you resonate then follow along. I keep it no fluff.
LinkedIn: / arpitbhayani
Twitter: / arpit_bhayani
Weekly Newsletter: arpit.substack...
Thank you for watching and supporting! It means a ton.
I am on a mission to bring out the best engineering stories from around the world and make you all fall in love with engineering. If you resonate with this then follow along, I always keep it no-fluff.

Published: 11 Oct 2024

Comments: 25
@abhishekvishwakarma9045 · 3 years ago
This is something new for me. I just learned SQL for interviews 😅 but practically I have mostly used NoSQL databases. One request: if possible, create an application build that uses SQL as the database. That will be fun and interesting, as your teaching is super clear, sir 🔥🔥
@lehung-up8jv · 5 months ago
Thank you so much. Nice and simple solution.
@hc90919 · 1 year ago
Arpit, your videos are awesome. I have been watching your videos for the last week and I liked all your content. Curious how you developed so much knowledge on every topic (DB, HLD, microservices)?
@AsliEngineering · 1 year ago
I worked 12 hours a day for the last 10 years. Be it personal projects or office work. No shortcuts.
@hc90919 · 1 year ago
@AsliEngineering Got it. How do you retain that knowledge? Let's say you study a new topic: where do you study it from, and how do you retain the knowledge? Can you please explain your learning approach at a high level?
@AsliEngineering · 1 year ago
@hc90919 It just happens. I put no explicit effort into retaining my learnings.
@77loutube · 3 years ago
How did you come up with this topic? Are you generating data for your staging 😀?
@AsliEngineering · 3 years ago
hehehehe :) Guruji :) I was trying to build Udemy's taxonomy and test my SQL efficiency: the very use case discussed in the video.
@Platoface · 4 months ago
I think you called me today about 20 times from 20 different numbers. Jk, I appreciate your video. I need to find faster queries on a table of only 400k records. The problem is I have 30 UNION ALLs to aggregate data for 30 locations. The YTD for each month kills performance. Kills it….
@pawarkishor · 3 years ago
Hi @Arpit Bhayani, I was just wondering: what made you create a single table for 3 separate entities? What thinking went behind that?
@AsliEngineering · 3 years ago
Extensibility. Introducing a 4th type in the hierarchy would otherwise require us to create a new table. And what about scaling the hierarchy to a depth of 10, 20, 50? If we created a new table for every type in the hierarchy, we would have to create that many tables, which is not feasible. Another rationale behind this decision is the schema: every single type of topic has the same schema and the same set of indexes, so putting them in different tables did not make sense. What if we want to skip a level for some topic? Say a topic, instead of having a "sub-category" as its parent, directly has a "category" as its parent. That is not easily doable with 3 tables.
@pawarkishor · 3 years ago
@AsliEngineering Awesome. Thanks.
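A brief sketch of why that single self-referencing table is flexible, reusing the hypothetical topics schema sketched earlier; adding or skipping a level is just data, not a schema change:

-- Adding a 4th level needs no new table, only a new type value:
ALTER TABLE topics MODIFY type
  ENUM('category', 'subcategory', 'topic', 'subtopic') NOT NULL;

-- A topic can skip the subcategory level and hang directly off a category:
INSERT INTO topics (name, type, parent_id)
SELECT 'standalone-topic', 'topic', id
FROM topics WHERE type = 'category' LIMIT 1;

-- Walking one level up is a single self-join, whatever the depth:
SELECT child.name AS child, parent.name AS parent
FROM topics child
JOIN topics parent ON parent.id = child.parent_id;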
@koushikr1361 · 2 years ago
Can anyone explain why, at the end of the video, the number of rows inserted is 5,000,000 and not 5,005,050?
@marcusaureliusfanboy · 3 years ago
@Arpit Bhayani is this MySQL Workbench?
@AsliEngineering · 3 years ago
DataGrip.
@RohanChouguleTechEd · 3 years ago
Just wondering, what made MySQL pick the ordering of digits at 24:43? I.e., 090, 190, 290, etc., instead of just 000, 001, 002 ... 999?
@AsliEngineering · 3 years ago
I was wondering that too. An interesting pattern was that the order was 0-9 again when we went from 2 tables to 3. It will be fun to find out. Go for it 😁
@ankushsingh0 · 3 years ago
@AsliEngineering Interesting thing: up to 5 digits it followed a pattern, and then just a random order for counters of 6, 7, and more digits.
select concat(d1.id) from digit d1;
select concat(d1.id, d2.id) from digit d1, digit d2;
select concat(d1.id, d2.id, d3.id) from digit d1, digit d2, digit d3;
select concat(d1.id, d2.id, d3.id, d4.id) from digit d1, digit d2, digit d3, digit d4;
select concat(d1.id, d2.id, d3.id, d4.id, d5.id) from digit d1, digit d2, digit d3, digit d4, digit d5;
select concat(d1.id, d2.id, d3.id, d4.id, d5.id, d6.id) from digit d1, digit d2, digit d3, digit d4, digit d5, digit d6;
select concat(d1.id, d2.id, d3.id, d4.id, d5.id, d6.id, d7.id) from digit d1, digit d2, digit d3, digit d4, digit d5, digit d6, digit d7;
select concat(d1.id, d2.id, d3.id, d4.id, d5.id, d6.id, d7.id, d8.id) from digit d1, digit d2, digit d3, digit d4, digit d5, digit d6, digit d7, digit d8;
@AsliEngineering · 3 years ago
@ankushsingh0 Time to dive into SQL internals, it seems :)
@neerajkunturkar4888 · 3 years ago
I don't think MySQL, or any relational database in general, guarantees any specific ordering of selected data unless it is specified in the query. Mostly it is version-, database-engine-, and platform-dependent. For int/long ids, most engines follow the natural order of the data. This is based on my experience though; I'd have to check the specs.
@ankushsingh0 · 3 years ago
@AsliEngineering I am sure we are not done with Python internals yet. :)
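As the thread above notes, a relational database guarantees result order only when asked for it; a small illustration reusing the digit table from the queries above, where ORDER BY is the only portable way to get a deterministic 000, 001, 002 ... 999 sequence:

-- Without ORDER BY the row order is an artifact of the execution plan;
-- with it, the order is guaranteed regardless of engine or version.
SELECT CONCAT(d1.id, d2.id, d3.id) AS counter
FROM digit d1, digit d2, digit d3
ORDER BY d1.id, d2.id, d3.id;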
@adilkhatri7475 · 3 years ago
Please make the videos in Hindi from next time, if possible!!
@AsliEngineering · 3 years ago
I thought of creating content in Hindi, but then the demographic gets restricted. Sorry mate, there is no plan right now; maybe sometime in the future. I completely agree that it would be more fun in Hindi, and I would definitely enjoy teaching much more, but I can't do it right now. Sorry for this.