I felt bad five minutes in that I hadn't yet liked, subscribed, rung that bell, and queued up the rest of the playlist, so I got on that. OH WAIT, THERE'S 34 VIDEOS, WHOA
@@fabianschuiki I'm so excited, it's going great so far. I always viewed superscalar execution as the edge of understandability for people who don't work at Intel or something, and out-of-order execution as just beyond that: you can get the principle, but as a normal person I'd never be able to implement it. Hopefully seeing you do so won't make me go insane and either throw all my money at random ICs, breadboards, and an overpriced oscilloscope, or (more likely and less destructively) just steal my time and make me write an emulator. But hopefully it will just help me understand.
@Rowlesisgay 😃 I hope it will all work out. It should in theory, but you never know if some breadboard or PCB will just randomly decide to catch fire. But it should be possible to get some decent OoO and superscalar execution going even with a very simple 8-bit design, focusing more on how it works rather than the complexity of doing it for 64 bits at 5 GHz.
@@fabianschuiki I always have trouble believing my CPU can run at multiple GHz. That's Wi-Fi frequency (mine boosts past 4); it's so high that microwaves should be shooting out of the PCB. How can normal PCB material carry the signals, and how can logic circuits make anything digital out of the analog wobbling that everything really is? I have, however, learned that when it comes to computer science, if there's an idea where it makes no sense how it could be more practical than sticking with what you've got, even though it's better under intensive load, the industry obviously switched to it in the 90s, or the 00s for outliers like multi-core CPUs.
Holy shit, I've been looking for a video like this forever. I have the concepts for a basic CPU down, but finding information on modern superscalar concepts is a mess of navigating Wikipedia pages and random sources. Having such an exemplary series is worthy of a university course! Thanks for making this. Rang da bell, commented, liked, hoping this finds more people looking for this.
I am soooo excited about this journey. I can hardly wait to see the next videos: #25, #26, ... #70 ☺ 👍👍👍👍
Amazing. I just got recommended one of your videos. As others have commented, at the moment, your views are criminally low, especially considering their high quality. Hopefully the recommendation I just got is a sign that’s about to change. Thank you for creating this!
Thank you for the kind words 🙂! It's an interesting journey. But I also appreciate the videos as just documentation of the build itself. So it's worthwhile either way 😉
Just found this series after watching a few others. Really looking forward to it. I'm in the same process, building my own processor to power an over-the-top chicken coop. I was initially thinking about using the ESP32 platform, but this seems a lot more fun and challenging, and I have an entire wall to display all the boards that will be running it. Thanks for documenting this.
Watched bits and pieces so far and this is just so great. Ben Eater eat your heart out! Seriously I loved Ben’s work and learned so much and this is the next level.
Can’t believe I haven’t found your videos before! Very very cool. 5:29 surprised not to see basic pipelining (one instruction issue per clock, multiple clocks to complete an instruction) here, since it’s the logical next step after concluding your clock rate won’t scale with every instruction needing to execute in a single clock cycle. Anyway, great stuff!
The content is very good and the videos are neat and polished! I am amazed at both your work and the extreme lack of views for such high-quality content. Expressing my gratitude for your work! This is gonna blow up some day! Best of luck, mate!
Subbed, as I kinda wanted to tackle a VLIW processor based on RISC-V, and I wanted to know exactly how the out-of-order execution scheduler is put together. That's the most important black box to me, so I can figure out where and how to push it; the VLIW fetch and compaction stage would be separately responsible for superscalar issue internally, of course. Thanks for putting together the playlist for this processor, as it's very useful and kind of hard-to-find information.
@@fabianschuiki The extensions are built on floating point instructions, specifically used to enable HPC Monte Carlo Simulations, with these instructions I was able to take advantage of pseudo-dual issue 🙂
Your projects are very interesting and educational. Congratulations. Do you know the 4-bit hex display? It could be very interesting for your projects. IBM used it in its 90s mainframes. For example, the old HP 5082-7340 or the current HDSP-0772 (very expensive).
In terms of YouTube videos and tutorials, I'd recommend:
- Ben Eater's 8-bit breadboard CPU build (minimalist, simple-as-possible): ru-vid.com/group/PLowKtXNTBypGqImE405J2565dvjafglHU
- James Sharman's 8-bit breadboard CPU build (pipelined, a bit more involved): ru-vid.com/group/PLFhc0MFC8MiCDOh3cGFji3qQfXziB9yOw
If you're looking for the full CPU design treatment:
- Computer Architecture: A Quantitative Approach, by Hennessy & Patterson: www.google.com/books/edition/Computer_Architecture/MBQFuAEACAAJ?hl=en
For computer architecture in general, the RISC-V instruction set is a very good starting point to get your hands dirty. It's very clean and elegantly designed, and it lacks some of the baggage that has accumulated in other ISAs over the years.
I don’t get why “out of order” involves a “look ahead”. I thought we just have our parallel decode and fetch circuit, which is simply a vector unit because instructions have a fixed length, so this won’t stall. But execution may need to wait on the scoreboard. So out of order simply means that we don’t stall for these: we just mark their result register on the scoreboard as well and go on. The trick is to fetch 4 instructions at once and do the scoreboard, but only have 2 ALUs, so fetch runs ahead automatically. There is no extra look-ahead circuitry. And why do we care about register renaming? 32 names are enough. Let’s have a separate stack pointer and instruction pointer. The only use I see is that we could eliminate the write-back register for reg-reg.
Yes you're totally right. The look-ahead isn't something you'd add explicitly, but it's a side effect of being able to decode and issue instructions even if their inputs aren't available yet. As you say, if you fetch faster than you can execute, or if registers have to wait for results, this effectively looks like the processor is looking ahead of currently stalled instructions to find work it can already do. No additional circuitry needed indeed! 👍
I’d like to add that vector fetch and decode will need to be followed by a scoreboard stage, similar to sprite priority on the C64. Instructions (via their sources) on the “right-hand” side of the vector will be blocked by target register names on the “left”.
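The issue logic discussed here can be sketched in a few lines of Python. This is just my own toy illustration of the idea (fetch a group of 4, scoreboard the destination registers, issue at most 2 per cycle); the names and data layout are invented, not from the video:

```python
# Toy scoreboard sketch: issue up to 2 instructions per cycle out of a
# fetch group of 4, blocking any instruction whose source register is
# the target of an earlier in-flight or stalled instruction (RAW hazard).
from dataclasses import dataclass

@dataclass
class Instr:
    dst: int         # destination register index
    srcs: tuple      # source register indices

def issue_group(group, busy, max_alus=2):
    """Return the instructions from `group` that may issue this cycle.

    `busy` is the scoreboard: register indices whose values are still
    being produced by in-flight instructions. The group is scanned left
    to right; a "right-hand" instruction is blocked if one of its
    sources is the target of a "left-hand" one, as in the sprite
    analogy above.
    """
    issued = []
    pending = set(busy)
    for instr in group:
        blocked = any(s in pending for s in instr.srcs)
        if not blocked and len(issued) < max_alus:
            issued.append(instr)
        # Issued or stalled, this destination now blocks later readers.
        pending.add(instr.dst)
    return issued

# r2 = r0+r1; r3 = r2+r0 (blocked on r2); r5 = r4+r4 (independent)
group = [Instr(2, (0, 1)), Instr(3, (2, 0)), Instr(5, (4, 4))]
ready = issue_group(group, busy=set())   # issues the 1st and 3rd
```

Note how the second instruction stalls while the third one still issues, which is exactly the "looking ahead of stalled instructions" effect described above without any dedicated look-ahead circuitry.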
The Hennessy & Patterson book on computer architecture should cover that in a fair amount of detail, if I recall correctly:
- Computer Architecture: A Quantitative Approach, by Hennessy & Patterson: www.google.com/books/edition/Computer_Architecture/MBQFuAEACAAJ?hl=en
Also, in general you should be able to dig up a lot of interesting resources if you Google "Tomasulo's algorithm". That was one of the initial implementations of out-of-order and superscalar execution. Most modern architectures use modified flavors of this original idea.
I tried to look up reservation stations on Wikipedia. I don’t get it. I expect a matrix where every producing unit (ALU, barrel shifter, DIV, Load) outputs its result on its row together with the name of the destination register, while on the columns every consuming unit (Store, CMP, Test) waits for input. The cells compare register names.
They come in different forms. The earlier ones used to store the operation to be executed, with additional fields for the operands. If a result wasn't available when the instruction was put in the reservation station, that field would be set to an ID that identifies the result that still has to be produced by another functional unit. When a result has been computed, it would be sent over a bus alongside its ID. The result gets stored in the register file, but all reservation stations also observe the ID and check if it matches one of the entries (like your cells). If it does, that entry would be replaced by the result. Once all entries have all their operands present, the instruction would be executed. Some newer variants of the technique put all of the information into the reorder buffer, in a more centralized manner.
@@fabianschuiki A central buffer means access conflicts when you go superscalar, plus latency due to the indirection between the execution logic and the buffer (the opcode needs another decode). That only makes sense for a scalar RISC CPU which has to orchestrate external hardware like vector processors (GPU, Cray), large memory banks (servers, mainframes), or slow EEPROM (embedded, consoles with cartridges).
I don't have the stuff to do this in real life, but I was wondering, if I would do it digitally, should I use the game "Turing Complete", or the logic sim "Digital" to do this?
Digital should be a good platform to experiment with stuff like this! I'm pretty sure there are also some videos on YouTube that discuss the topic specifically with Digital as the underlying sim 🙂
There are a few very nice series here on YouTube where people build custom CPUs. (Check out Ben Eater's 8-bit CPU series and James Sharman's 8-bit pipelined CPU series, for example.) Ben Eater also has component lists and kits that let you build his 8-bit CPU and follow along with his 6502 build.