Matt Butrovich (mattbutrovi.ch/) Slides: 15445.courses.cs.cmu.edu/fall... Notes: 15445.courses.cs.cmu.edu/fall... 15-445/645 Intro to Database Systems (Fall 2023) Carnegie Mellon University 15445.courses.cs.cmu.edu/fall...
I'm still confused about the simple block nested loop join case (no index). Why wouldn't I want to reserve as much buffer pool (BP) space as possible for my inner table? The inner table is going to be iterated over and over, whereas the sequentially scanned outer table needs only one page, which can be evicted as soon as the next one is read because it will not be accessed again. Why keep cold data in the BP? ...
I think it is because each iteration of the outer loop (for each block of R) has to scan all N blocks of the inner table. If the inner table is too large to fit in memory, every iteration has to load all N blocks into the BP again. So what we need is to reduce the number of outer-loop iterations, from M to ceil(M/(B-2)), by giving B-2 buffer pages to the outer table. Hope that is clear.
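To put numbers on this, here is a minimal sketch of the I/O cost arithmetic. It assumes M is the number of outer table blocks, N the number of inner table blocks, and B the number of buffer pages, with one page kept for the inner scan and one for output. The function names and the example sizes are made up for illustration.

```python
from math import ceil


def page_nlj_io_cost(m_outer: int, n_inner: int) -> int:
    """One outer block at a time: read the outer once, scan the inner M times."""
    return m_outer + m_outer * n_inner


def block_nlj_io_cost(m_outer: int, n_inner: int, buffers: int) -> int:
    """B-2 outer blocks at a time: the inner is scanned ceil(M/(B-2)) times."""
    if buffers < 3:
        raise ValueError("need at least 3 buffer pages")
    inner_scans = ceil(m_outer / (buffers - 2))
    return m_outer + inner_scans * n_inner


# Hypothetical sizes: M = 1000, N = 500, B = 102
one_at_a_time = page_nlj_io_cost(1000, 500)        # 1000 + 1000 * 500
chunked = block_nlj_io_cost(1000, 500, 102)        # 1000 + 10 * 500
print(one_at_a_time, chunked)
```

With these sizes the inner table is scanned 10 times instead of 1000 times, which is where the savings come from.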
Simply do the math for each case. The total cost is the number of inner table blocks multiplied by the number of passes over the inner table, plus one read of the outer table. Each pass goes: read some blocks of the outer table into memory, scan the inner table, find the tuples that satisfy the condition, then repeat with the next blocks of the outer table. Therefore, the more outer table blocks we hold in memory at once, the better (fewer inner table scans are needed). Ideally, the outer table fits completely in memory, and then we read the outer table once and the inner table once.
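The loop described here can be sketched as a small simulation that counts page reads. This is only an illustration under stated assumptions: tables are Python lists of pages, each page is a list of values, and two of the buffer pages are set aside for the inner scan and the output. The function name and the toy data are hypothetical.

```python
def block_nested_loop_join(outer_pages, inner_pages, buffer_pages, predicate):
    """Join two paged tables and return (matches, number of page reads)."""
    chunk = buffer_pages - 2  # pages available for the outer table
    if chunk < 1:
        raise ValueError("need at least 3 buffer pages")
    reads = 0
    matches = []
    for start in range(0, len(outer_pages), chunk):
        block = outer_pages[start:start + chunk]
        reads += len(block)              # read the next chunk of the outer table
        for inner_page in inner_pages:   # one full scan of the inner table
            reads += 1
            for outer_page in block:
                for r in outer_page:
                    for s in inner_page:
                        if predicate(r, s):
                            matches.append((r, s))
    return matches, reads


# Toy tables: 4 outer pages (M = 4) and 3 inner pages (N = 3)
outer = [[1, 2], [3, 4], [5, 6], [7, 8]]
inner = [[2, 4], [6, 9], [8, 1]]

for b in (3, 4, 6):
    rows, reads = block_nested_loop_join(outer, inner, b, lambda r, s: r == s)
    print(b, len(rows), reads)
```

The join result is the same for every buffer size; only the number of page reads changes, and it reaches M + N once the whole outer table fits in the B-2 pages.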
I think it's pretty good, and the instructor's pronunciation is very standard and clear. As a Chinese student, I feel that this course allows me to learn about DBMS while also practicing my English listening skills, since both Andy and this teacher speak at a relatively fast pace.