Andy Pavlo (www.cs.cmu.edu...) Slides: 15445.courses.... Notes 15445.courses.... 15-445/645 Intro to Database Systems (Fall 2022) Carnegie Mellon University 15445.courses....
I've used and coded for IMS, and it still has utility where it's used - definitely not "crap from that time". It is highly probable that the system that keeps the professor's checking account uses IMS rather than an RDBMS, and that is due to unmatched reliability of these systems.
What was the deal with the blockchain kook who interrupted and went on a rant and then stormed out at around 25 minutes in? Was that a student, or just some random person?
Thank you so much. I don't have a CS degree, but I have self-learned and now I am a developer who needs to pick up fundamental CS knowledge. You have just saved my life. Thank you.
I am now doing my Master's, and during my studies I have already attended two database courses. Still, I am learning a lot of new concepts here (or known concepts in much greater detail).
Fun fact: E. F. "Ted" Codd, aka the Coddfather, invented the Relational Model (relations, tuples, domains; primary key, foreign key), but also the first query language for his model (ALPHA), the term "data model" and the term OLAP. He was also highly critical of SQL (calling it Fatally Flawed back in 1985) because it broke a bunch of consistency, which STILL hasn't really been fixed (like allowing duplicate rows, and returning anything that's not a relation, like a single row, a column or a single scalar/cell value). I've read all the publicly available letters he wrote BTW. Good stuff. Even his criticisms of the Entity-Relationship Model from 1976 (?) by Peter Chen, IIRC.
can't believe CMU has to deal with hallucinating crypto fanatics. I guess bad ideas can take hold of otherwise intelligent people, but it's hard to understand how
I'm very lucky I found this course. I have been struggling to find a good course like this on the internet. Thank you very much! You have no idea how much this helps college dropouts like myself.
This class is amazing. The course at my Univ spends one lecture on SQL, then throws it away and focuses on EER or other database theories. 3 or 4 weeks later, they suddenly come back and start looking at relational algebra. I don't like it. I even got the wrong impression that relational algebra was invented later than SQL ... It's always good to introduce closely related concepts together, like this class does, unless it is absolutely necessary to break them apart. Thank you Andy. Awesome content!
Thanks professor for making these available publicly. Really appreciate this. Is there any way to reduce the size of the class recording so that the slides are not cut off from the video?
It's great to see Andy on the video! You can always download slides from the course site, if there's something important you can't see on the video. Well, maybe it would be alright to cut the classroom video at the top a bit, as there's not much useful stuff going on there.
To those in the comments who might be more knowledgeable - is this course still useful for those who do not know C++? I know some SQL and I'm learning Python, but C++ looks a little too advanced for me to get into. Can these database systems be written in another language, or is C++ the most commonly used in industry? I think understanding the algorithms and table/index types would be useful, but I don't know enough about DE to understand if it would be relevant without the C++ background.
You can spend a few days learning some basic modern C++ (note: modern C++, which has things like smart pointers) and then see if you can complete . The following is my personal opinion, which may be incorrect: this course is about "implementation", so you need to get your hands dirty to understand it better. A DBMS is a system, not a set of piecemeal algorithms; it's a combination of algorithms, data structures, and some OS concepts (concurrency, storage, ...). So you may need a basic project to help you grasp such a huge system. This course gives us "bustub", which is written in C++, so C++ knowledge is required.
As long as the language you use is Turing Complete (which both cpp and Python are), then you can program anything in either language. The Python version is likely to be slower, but that's fine if you just want to learn.
46:00 It's confusing that in SQL, SELECT picks attributes, while in Relational Algebra, selection picks tuples instead.
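To make the naming clash concrete, here is a minimal Python sketch that models a relation as a list of dicts. The `students` data and the helper names `select` and `project` are invented for illustration; this is not code from the course.

```python
# A relation modeled as a list of dicts. The data is made up for illustration.
students = [
    {"sid": 1, "name": "Ada", "age": 21},
    {"sid": 2, "name": "Bob", "age": 19},
    {"sid": 3, "name": "Cy", "age": 25},
]

def select(relation, predicate):
    """Relational-algebra selection: keep the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]

def project(relation, attributes):
    """Relational-algebra projection: keep only the named attributes, dropping duplicates."""
    seen = set()
    result = []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result

# Roughly: SELECT name FROM students WHERE age > 20
older = select(students, lambda r: r["age"] > 20)  # this is SQL's WHERE clause
names = project(older, ["name"])                   # this is SQL's SELECT list
print(names)
```

So the SQL keyword SELECT lines up with projection, and the WHERE clause is what does the relational-algebra selection.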
Oh gods, SQL's natural join compares the NAMES of the columns? That's awful, and another point of evidence for why SQL is not the Relational Model. It's why Codd hammered on the idea of using shared Domains to join on, not shared column names. SQL, what a joke! 😂
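A small sketch of that name-matching behavior, using Python's built-in `sqlite3` module. The `student` and `course` tables and their rows are made up; the point is only that both tables happen to have columns called `id` and `name`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (id INTEGER, name TEXT);
    CREATE TABLE course  (id INTEGER, name TEXT);
    INSERT INTO student VALUES (1, 'Ada'), (2, 'Logic');
    INSERT INTO course  VALUES (1, 'Databases'), (2, 'Logic');
""")

# NATURAL JOIN pairs rows on every column name the two tables share
# (here both id and name), whether or not the columns mean the same thing.
natural = conn.execute("SELECT * FROM student NATURAL JOIN course").fetchall()

# Spelled out, the same join condition looks like this.
explicit = conn.execute(
    "SELECT s.id, s.name FROM student AS s "
    "JOIN course AS c ON s.id = c.id AND s.name = c.name"
).fetchall()

print(natural)
print(explicit)
```

Here a student who happens to be called 'Logic' with id 2 gets matched to course 2, purely because the column names coincide.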
Yes, in the Relational Model there are no duplicates within any single relation, and if you join two relations the result is a new relation which, like any other relation, does not contain duplicate rows. By that standard there is no popular truly relational DBMS in existence, since Postgres, DB2, Oracle, etc. all allow duplicate rows and thus are not truly relational.
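A quick way to see the duplicate-row behavior, sketched with Python's built-in `sqlite3` module (SQLite stands in here for SQL systems in general; the `pet` table is invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# No primary key, so nothing stops the same row from being stored twice.
conn.execute("CREATE TABLE pet (owner TEXT, species TEXT)")
conn.executemany(
    "INSERT INTO pet VALUES (?, ?)",
    [("Ada", "cat"), ("Ada", "cat"), ("Bob", "dog")],
)

# Plain SELECT returns a bag (multiset): the duplicate row comes back twice.
bag = conn.execute("SELECT owner, species FROM pet ORDER BY owner").fetchall()

# DISTINCT has to be requested explicitly to get set semantics.
as_set = conn.execute(
    "SELECT DISTINCT owner, species FROM pet ORDER BY owner"
).fetchall()

print(bag)
print(as_set)
```

In the Relational Model the second result is the only one that counts as a relation; in SQL it is opt-in.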
39:00 I think the difference becomes important in replication. Given a many-to-many relationship realised with an array column, one would need to violate an FK constraint. With an intermediate table, one can copy records row by row.
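A rough sketch of the intermediate-table half of this, using Python's built-in `sqlite3` module. SQLite has no array columns, so only the junction-table design is shown; the `artist`, `album`, and `artist_album` tables and the `new_db` helper are invented for illustration, and the copy loop is a toy stand-in for real replication.

```python
import sqlite3

SCHEMA = """
    CREATE TABLE artist (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE album  (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE artist_album (
        artist_id INTEGER REFERENCES artist(id),
        album_id  INTEGER REFERENCES album(id),
        PRIMARY KEY (artist_id, album_id)
    );
"""

def new_db():
    """Hypothetical helper: an empty in-memory database with FK checks enabled."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript(SCHEMA)
    return conn

source = new_db()
source.executescript("""
    INSERT INTO artist VALUES (1, 'A'), (2, 'B');
    INSERT INTO album  VALUES (10, 'X');
    INSERT INTO artist_album VALUES (1, 10), (2, 10);
""")

# Copy row by row, parent tables first, so every link row finds its targets.
replica = new_db()
for table in ("artist", "album", "artist_album"):
    for row in source.execute(f"SELECT * FROM {table}"):
        placeholders = ", ".join("?" * len(row))
        replica.execute(f"INSERT INTO {table} VALUES ({placeholders})", row)
replica.commit()
copied = replica.execute(
    "SELECT * FROM artist_album ORDER BY artist_id"
).fetchall()

# A link row that arrives before its parents is rejected by the FK check.
try:
    new_db().execute("INSERT INTO artist_album VALUES (1, 10)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

print(copied, fk_rejected)
```

Each link is its own row, so it can be shipped independently once both of its parent rows exist on the replica.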
You're right, a blockchain is ultimately just a system - it's the distributed ledger mechanism that provides certain properties like immutability, decentralization, and consensus. But in order to actually store and transact data on the blockchain, you need a few additional components:
1. A data model - This defines the structure and relationships of the data you want to store on the blockchain. For Bitcoin, the data model defines transactions, blocks, addresses, etc. For Ethereum, the data model includes accounts, smart contracts, tokens, etc. The data model determines how data elements relate to each other and the rules around transacting with the data.
2. A format for representing the data - Things like JSON, XML, CSV, etc. The format determines how the data elements defined in the data model are encoded into strings that can be stored on the blockchain.
3. APIs and interfaces - These provide a way for users and applications to read and write data to the blockchain. For example, Bitcoin has APIs to create transactions, get wallet balances, etc. Ethereum has APIs for deploying and executing smart contracts.
4. Consensus rules - The consensus algorithm, like proof-of-work or proof-of-stake, maintains agreement between nodes about the state of the data and ensures only valid transactions/data are recorded on the blockchain.
5. Node software - The blockchain client software that implements the data model, formats, APIs, consensus rules, and runs on the nodes that maintain the network. For Bitcoin, this is reference implementations like Bitcoin Core. For Ethereum, it's clients like Geth and Parity.
So you're right that a blockchain alone is just a distributed ledger mechanism. All of these additional components - the data model, formats, interfaces, rules, and node software - build on top of the blockchain and are needed to actually implement a usable ledger system, whether it's for recording transactions, smart contracts, identity data, or anything else.
The blockchain provides the foundation, but you need to construct a lot on top of it!
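To illustrate just the first two items (a data model and an encoding for it), here is a toy Python sketch. The `Transaction` and `Block` classes, their fields, and the JSON encoding are all made up for this example; Bitcoin and Ethereum clients use their own data structures and binary encodings, and nothing here touches consensus or networking.

```python
import hashlib
import json
from dataclasses import dataclass, asdict

@dataclass
class Transaction:
    """Toy data model for a transfer (hypothetical fields)."""
    sender: str
    receiver: str
    amount: int

@dataclass
class Block:
    """Toy block: a position, a link to the previous block, and its transactions."""
    index: int
    prev_hash: str
    transactions: list

def encode(block):
    """The 'format' layer: serialize the data model to a canonical JSON string."""
    return json.dumps(asdict(block), sort_keys=True)

def block_hash(block):
    """Hash of the encoded block; the next block stores this as prev_hash."""
    return hashlib.sha256(encode(block).encode("utf-8")).hexdigest()

def is_valid_chain(chain):
    """Check that every block points at the hash of the block before it."""
    return all(
        chain[i].prev_hash == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )

genesis = Block(0, "0" * 64, [])
block1 = Block(1, block_hash(genesis), [Transaction("alice", "bob", 5)])
chain = [genesis, block1]

valid_before = is_valid_chain(chain)
genesis.transactions.append(Transaction("mallory", "mallory", 1000))  # tamper
valid_after = is_valid_chain(chain)
print(valid_before, valid_after)
```

Changing an earlier block changes its hash, so the link stored in the next block no longer matches. That is the only "immutability" this sketch models.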